Embodied agents face a fundamental limitation: once deployed in real-world environments to perform specific tasks, they are unable to acquire additional knowledge that would improve task performance. In this paper, we propose Dejavu, a general post-deployment learning framework that employs an Experience Feedback Network (EFN) to augment a frozen Vision-Language-Action (VLA) policy with retrieved execution memories. EFN identifies contextually relevant prior action experiences and conditions action prediction on this retrieved guidance. We train EFN with reinforcement learning using semantic-similarity rewards, ensuring that predicted actions align with past behaviors under the current observations. During deployment, EFN continually enriches its memory with new trajectories, enabling the agent to exhibit "learning from experience". Experiments across diverse embodied tasks show that EFN improves adaptability, robustness, and success rates over frozen baselines. We provide code and a demo in our supplementary material.
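The abstract describes retrieving contextually relevant past experiences to condition a frozen policy. The following is a minimal illustrative sketch of such an experience memory, not the paper's actual implementation: all class and method names (`ExperienceMemory`, `retrieve`) are hypothetical, and we assume observation embeddings compared by cosine similarity, with the retrieved actions then passed as extra conditioning to the policy.

```python
import numpy as np

class ExperienceMemory:
    """Hypothetical store of (observation embedding, action) pairs
    from past executions; not the paper's actual EFN."""

    def __init__(self):
        self.keys = []     # observation embeddings
        self.values = []   # actions taken under those observations

    def add(self, obs_embedding, action):
        self.keys.append(obs_embedding)
        self.values.append(action)

    def retrieve(self, query, k=1):
        # Cosine-similarity nearest neighbours over stored embeddings.
        keys = np.stack(self.keys)
        sims = keys @ query / (
            np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-8
        )
        top = np.argsort(sims)[-k:][::-1]  # indices of the k most similar
        return [self.values[i] for i in top]

# Toy usage: the retrieved action(s) would be fed to the frozen VLA
# policy as additional guidance when predicting the next action.
memory = ExperienceMemory()
memory.add(np.array([1.0, 0.0]), "grasp_handle")
memory.add(np.array([0.0, 1.0]), "open_drawer")
retrieved = memory.retrieve(np.array([0.9, 0.1]), k=1)
```

New trajectories collected at deployment time would simply be appended to the memory via `add`, which is what lets the frozen policy benefit from accumulating experience without weight updates.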