Autonomous Planning In-space Assembly Reinforcement-learning free-flYer (APIARY) International Space Station Astrobee Testing

The US Naval Research Laboratory's (NRL's) Autonomous Planning In-space Assembly Reinforcement-learning free-flYer (APIARY) experiment pioneers the use of reinforcement learning (RL) for control of free-flying robots in the zero-gravity (zero-G) environment of space. On Tuesday, May 27, 2025, the APIARY team conducted what is, to our knowledge, the first RL control of a free-flyer in space, using the NASA Astrobee robot on board the International Space Station (ISS). A 6-degree-of-freedom (DOF) control policy was trained with an actor-critic Proximal Policy Optimization (PPO) network in the NVIDIA Isaac Lab simulation environment, randomizing over goal poses and mass distributions to enhance robustness. This paper details the simulation testing, ground testing, and flight validation of the experiment. The on-orbit demonstration validates the transformative potential of RL for improving robotic autonomy, enabling rapid development and deployment (in minutes to hours) of tailored behaviors for space exploration, logistics, and real-time mission needs.
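To make the randomization over goal poses and mass distributions concrete, the following is a minimal sketch of per-episode sampling. It is not the APIARY team's code and does not use the Isaac Lab API. The names `EpisodeParams`, `sample_episode`, and `random_unit_quaternion`, as well as the nominal mass, inertia, and range values, are hypothetical placeholders chosen for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    goal_position: tuple    # (x, y, z) offset in metres (placeholder frame)
    goal_quaternion: tuple  # unit quaternion (w, x, y, z)
    mass: float             # kg
    inertia_diag: tuple     # principal moments of inertia, kg*m^2


def random_unit_quaternion(rng):
    """Uniformly random rotation: normalise four independent Gaussian draws."""
    q = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)


def sample_episode(rng, nominal_mass=10.0, nominal_inertia=(0.15, 0.15, 0.15),
                   position_range=0.5, scale_range=0.2):
    """Draw one randomised goal pose and mass distribution.

    All defaults are assumed placeholder values, not measured robot properties.
    """
    goal_position = tuple(
        rng.uniform(-position_range, position_range) for _ in range(3)
    )
    # Scale mass and each principal inertia independently around nominal.
    mass = nominal_mass * rng.uniform(1.0 - scale_range, 1.0 + scale_range)
    inertia = tuple(
        i * rng.uniform(1.0 - scale_range, 1.0 + scale_range)
        for i in nominal_inertia
    )
    return EpisodeParams(goal_position, random_unit_quaternion(rng), mass, inertia)


rng = random.Random(0)  # fixed seed so the draws are reproducible
for _ in range(3):
    print(sample_episode(rng))
```

In a training loop, one such draw would typically be applied at each episode reset, so that the policy sees a different target and a different rigid-body model every episode rather than overfitting to a single nominal configuration.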
