Vision-language-action models (VLAs) show potential as generalist robot policies. However, they pose severe safety challenges in real-world deployment, including the risk of harm to the environment, the robot itself, and humans. How can safety constraints be explicitly integrated into VLAs? We address this question with an integrated safety approach (ISA) that systematically models safety requirements, actively elicits diverse unsafe behaviors, constrains VLA policies via safe reinforcement learning, and assures their safety through targeted evaluations. Building on the constrained Markov decision process (CMDP) paradigm, ISA optimizes VLAs from a min-max perspective against the elicited safety risks. Policies aligned with this approach exhibit three key properties: (I) an effective safety-performance trade-off, reducing the cumulative cost of safety violations by 83.58% relative to the state-of-the-art method while maintaining task success rate (+3.85%); (II) strong safety assurance, with the ability to mitigate long-tail risks and handle extreme failure scenarios; and (III) robust generalization of learned safety behaviors to various out-of-distribution perturbations. We evaluate effectiveness on long-horizon mobile manipulation tasks. Our data, models, and newly proposed benchmark environment are available at https://pku-safevla.github.io.
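To make the min-max view of a CMDP concrete, the sketch below shows the standard Lagrangian relaxation: the policy maximizes L = J_r - lam * (J_c - b) while a non-negative multiplier lam minimizes it. The abstract does not specify ISA's exact update rule, so this is an illustrative assumption rather than the authors' implementation; the names `lagrangian`, `update_multiplier`, `cost_limit`, and `lr` are hypothetical.

```python
def lagrangian(reward_return: float, cost_return: float,
               lam: float, cost_limit: float) -> float:
    """Relaxed objective L = J_r - lam * (J_c - b).

    The policy is trained to maximize L; lam is trained to minimize it.
    (Assumed standard form, not taken from the paper.)
    """
    return reward_return - lam * (cost_return - cost_limit)


def update_multiplier(lam: float, cost_return: float,
                      cost_limit: float, lr: float = 0.1) -> float:
    """Dual step: gradient descent on L with respect to lam.

    dL/dlam = -(J_c - b), so descent raises lam when the cost exceeds the
    limit and lowers it otherwise. The result is projected onto lam >= 0.
    """
    return max(0.0, lam + lr * (cost_return - cost_limit))
```

Under these assumptions, a policy that overshoots the cost limit sees its penalty weight grow on the next iteration, which is how the safety-performance trade-off is adjusted during training.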