Fall recovery is a critical skill for humanoid robots in dynamic environments such as RoboCup, where prolonged downtime often decides the match. Recent deep reinforcement learning (DRL) techniques have produced robust get-up behaviors, yet existing methods require training a separate policy for each robot morphology. This paper presents a single DRL policy capable of recovering from falls across seven humanoid robots with diverse heights (0.48 to 0.81 m), weights (2.8 to 7.9 kg), and dynamics. Trained with CrossQ, the unified policy transfers zero-shot to unseen morphologies, reaching up to 86 ± 7% (95% CI [81, 89]) and eliminating the need for robot-specific training. Comprehensive leave-one-out experiments, morphology scaling analysis, and diversity ablations show that targeted morphological coverage improves zero-shot generalization. In some cases, the shared policy even surpasses the specialist baselines. These findings illustrate the practicality of morphology-agnostic control for fall recovery, laying the foundation for generalist humanoid control. The software is open-source and available at: https://github.com/utra-robosoccer/unified-humanoid-getup