Social navigation in densely populated dynamic environments poses a significant challenge for autonomous mobile robots, which must interact safely with the people around them. Existing reinforcement learning (RL)-based methods require more than 2,000 hours of training and often fail to generalize to unfamiliar environments without additional fine-tuning, which limits their practical use in real-world scenarios. To address these limitations, we propose SocialNav-Map, a novel zero-shot social navigation framework that combines dynamic human trajectory prediction with occupancy mapping, enabling safe and efficient navigation without environment-specific training. Specifically, SocialNav-Map first transforms the task goal position into the coordinate system of the constructed map. It then builds a dynamic occupancy map in which predicted human movements are treated as dynamic obstacles. The framework employs two complementary methods for human trajectory prediction: history prediction and orientation prediction. By integrating the predicted trajectories into the occupancy map, the robot can proactively avoid potential collisions with humans while navigating efficiently to its destination. Extensive experiments on the Social-HM3D and Social-MP3D datasets demonstrate that SocialNav-Map significantly outperforms state-of-the-art (SOTA) RL-based methods, which require 2,396 GPU hours of training. Notably, it reduces human collision rates by over 10% without any training in novel environments. By eliminating the need for environment-specific training, SocialNav-Map paves the way for deploying social navigation systems in real-world environments characterized by diverse human behaviors. The code is available at: https://github.com/linglingxiansen/SocialNav-Map.
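The pipeline described above (goal transformation, trajectory prediction, dynamic occupancy map) can be illustrated with a minimal sketch. It is not the released SocialNav-Map implementation. All function names and parameters are hypothetical, and the two predictors are simple stand-ins, because the abstract does not specify how history prediction and orientation prediction are computed. Here, history prediction is assumed to be constant-velocity extrapolation from the last two observed positions, and orientation prediction is assumed to be a straight-line rollout along the human's current heading.

```python
import math


def world_to_grid(x, y, origin, resolution):
    """Convert world coordinates (metres) to (row, col) indices of a grid map.

    `origin` is the world position of grid cell (0, 0); `resolution` is the
    cell size in metres. Both are assumed conventions for this sketch.
    """
    col = int(math.floor((x - origin[0]) / resolution))
    row = int(math.floor((y - origin[1]) / resolution))
    return row, col


def predict_from_history(history, steps):
    """Assumed stand-in for history prediction: constant-velocity
    extrapolation from the last two observed (x, y) positions."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, steps + 1)]


def predict_from_orientation(position, heading, speed, steps, dt):
    """Assumed stand-in for orientation prediction: roll the human forward
    along the current heading (radians) at a fixed speed."""
    x, y = position
    return [
        (x + speed * dt * k * math.cos(heading),
         y + speed * dt * k * math.sin(heading))
        for k in range(1, steps + 1)
    ]


def mark_dynamic_obstacles(grid, points, origin, resolution, radius):
    """Return a copy of `grid` with predicted positions marked occupied (1).

    Each point is inflated by a square footprint whose half-width is
    `radius` metres, rounded up to whole cells. Cells outside the map
    are ignored.
    """
    out = [row[:] for row in grid]
    r_cells = int(math.ceil(radius / resolution))
    for x, y in points:
        r0, c0 = world_to_grid(x, y, origin, resolution)
        for r in range(r0 - r_cells, r0 + r_cells + 1):
            for c in range(c0 - r_cells, c0 + r_cells + 1):
                if 0 <= r < len(out) and 0 <= c < len(out[0]):
                    out[r][c] = 1
    return out


if __name__ == "__main__":
    static_map = [[0] * 10 for _ in range(10)]   # 10 x 10 cells, all free
    origin, resolution = (0.0, 0.0), 0.5         # metres

    goal_cell = world_to_grid(4.0, 4.0, origin, resolution)
    future = predict_from_history([(1.0, 1.0), (1.5, 1.0)], steps=2)
    future += predict_from_orientation((1.5, 1.0), 0.0, 1.0, steps=2, dt=0.5)

    dynamic_map = mark_dynamic_obstacles(
        static_map, future, origin, resolution, radius=0.25
    )
    print("goal cell:", goal_cell)
    for row in dynamic_map:
        print("".join("#" if cell else "." for cell in row))
```

A planner would then search for a path to `goal_cell` on `dynamic_map` instead of the static map, so that cells a human is predicted to enter are avoided before the human reaches them. Merging the outputs of both predictors into one map is one plausible reading of "complementary"; the actual fusion rule is not stated in the abstract.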