Humanoid robots have attracted significant research interest and seen rapid advances in recent years. Despite these successes, their morphology, dynamics, and the limitations of their control policies make humanoid robots more prone to falling than other embodiments such as quadruped or wheeled robots. Moreover, their large weight, high center of mass, and many degrees of freedom mean that an uncontrolled fall can cause serious damage to both the robot itself and surrounding objects. Existing research in this field mostly focuses on control-based methods, which struggle to handle diverse falling scenarios and may introduce unsuitable human priors. In contrast, large-scale deep reinforcement learning and curriculum learning can be used to incentivize a humanoid agent to discover a fall-protection policy that fits its own nature and properties. In this work, using carefully designed reward functions and a domain diversification curriculum, we successfully train a humanoid agent to explore fall-protection behaviors, and we discover that by forming a `triangle' structure, the robot's rigid-material body can significantly reduce fall damage. With comprehensive metrics and experiments, we quantify the policy's performance against other methods, visualize its falling behaviors, and successfully transfer it to a real-world platform.