Whole-body humanoid motion is a central challenge in robotics: it requires balance, coordination, and adaptability to produce human-like behaviors. Existing methods typically require multiple training samples per motion category, which makes collecting high-quality human motion datasets labor-intensive and costly. To address this, we propose an approach that trains effective humanoid motion policies from a single non-walking target motion sample together with readily available walking motions. The core idea is to use order-preserving optimal transport to compute distances between walking and non-walking sequences, then interpolate along geodesics to generate new intermediate pose skeletons. These skeletons are optimized for collision-free configurations, retargeted to the humanoid, and integrated into a simulated environment for policy training via reinforcement learning. Experimental evaluations on the CMU MoCap dataset show that our method consistently outperforms the baselines across the evaluated metrics. Code will be released upon acceptance.
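To make the transport-then-interpolate step concrete, the following is a minimal sketch and not the authors' implementation. It assumes each pose is a flattened vector of joint coordinates, approximates order preservation by adding a temporal-distance penalty to an entropic (Sinkhorn) transport cost, and replaces geodesic interpolation with a straight-line blend toward the barycentric projection of the plan. The function names and the parameters `lam`, `eps`, and `n_iters` are hypothetical choices made for illustration.

```python
import numpy as np


def temporal_ot_plan(X, Y, lam=5.0, eps=0.1, n_iters=200):
    """Entropic OT plan between two pose sequences with a temporal penalty.

    X: (n, d) source frames, Y: (m, d) target frames (flattened poses).
    lam weights the temporal penalty; eps is the entropic regularizer.
    Both are illustrative assumptions, not values from the abstract.
    """
    n, m = len(X), len(Y)
    # Pairwise squared Euclidean cost between poses, scaled to [0, 1].
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    C = C / (C.max() + 1e-12)
    # Penalize couplings between frames far apart in normalized time,
    # which discourages plans that scramble the temporal order.
    ti = (np.arange(n) + 0.5) / n
    tj = (np.arange(m) + 0.5) / m
    P = (ti[:, None] - tj[None, :]) ** 2
    K = np.exp(-(C + lam * P) / eps)
    # Uniform mass on the frames of each sequence.
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    # Standard Sinkhorn scaling iterations.
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]


def interpolate_sequence(X, Y, T, t):
    """Blend each source frame toward its plan-weighted target frame.

    t = 0 returns X; t = 1 returns the barycentric projection of X onto Y.
    This is a linear blend in coordinate space, used here as a stand-in
    for interpolation along a geodesic.
    """
    Y_bar = (T @ Y) / T.sum(axis=1, keepdims=True)
    return (1.0 - t) * X + t * Y_bar


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    walk = rng.normal(size=(30, 12))    # placeholder walking sequence
    target = rng.normal(size=(40, 12))  # placeholder non-walking sequence
    plan = temporal_ot_plan(walk, target)
    halfway = interpolate_sequence(walk, target, plan, t=0.5)
    print(plan.shape, halfway.shape)
```

Sweeping `t` over several values between 0 and 1 yields a family of intermediate sequences. In the pipeline described above, such sequences would still need collision-free optimization and retargeting before they could serve as reference motions for reinforcement learning.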