Data scarcity remains a fundamental bottleneck for embodied intelligence. Existing approaches use large language models (LLMs) to automate gripper-based simulation generation, but they transfer poorly to dexterous manipulation, which demands more specialized environment design. Dexterous manipulation tasks are also inherently harder because of their higher degrees of freedom, so generating feasible and trainable dexterous hand tasks at scale remains an open challenge. To this end, we present GenDexHand, a generative simulation pipeline that autonomously produces diverse robotic tasks and environments for dexterous manipulation. GenDexHand introduces a closed-loop refinement process that adjusts object placements and scales based on vision-language model (VLM) feedback, substantially improving the average quality of generated environments. Each task is further decomposed into sub-tasks to enable sequential reinforcement learning, reducing training time and increasing success rates. By offering a simulation-based solution to synthetic data generation, our work provides a viable path toward scalable training of diverse dexterous hand behaviors in embodied intelligence. Our website: https://winniechen2002.github.io/GenDexHand/.
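As a rough illustration of the closed-loop refinement described above, the following is a minimal Python sketch, not the paper's implementation: `ObjectSpec`, `refine_scene`, and the critic interface are all hypothetical names. The idea is that a VLM-style critic is queried for per-object placement and scale adjustments, which are applied until the critic accepts the scene or an iteration budget runs out.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

@dataclass(frozen=True)
class ObjectSpec:
    """One object in a generated scene (hypothetical schema)."""
    name: str
    position: Tuple[float, float, float]  # (x, y, z) placement
    scale: float                          # uniform scale factor

# A critic maps a scene to (object index, new position, new scale)
# suggestions; an empty list means the scene is judged acceptable.
Suggestion = Tuple[int, Tuple[float, float, float], float]
Critic = Callable[[List[ObjectSpec]], List[Suggestion]]

def refine_scene(scene: List[ObjectSpec],
                 critic: Critic,
                 max_rounds: int = 5) -> List[ObjectSpec]:
    """Closed-loop refinement: apply critic-suggested edits to object
    placements and scales until the critic is satisfied or the round
    budget is exhausted."""
    for _ in range(max_rounds):
        suggestions = critic(scene)
        if not suggestions:  # critic accepts the scene as-is
            return scene
        for idx, new_pos, new_scale in suggestions:
            scene[idx] = replace(scene[idx], position=new_pos, scale=new_scale)
    return scene

if __name__ == "__main__":
    # Toy critic standing in for a rendered-image VLM query:
    # it lifts any object that penetrates the floor (z < 0).
    def toy_critic(scene: List[ObjectSpec]) -> List[Suggestion]:
        return [(i, (o.position[0], o.position[1], 0.0), o.scale)
                for i, o in enumerate(scene) if o.position[2] < 0.0]

    scene = [ObjectSpec("mug", (0.1, 0.2, -0.05), 1.0)]
    print(refine_scene(scene, toy_critic))
```

In a real pipeline the critic would render the simulated scene and prompt a VLM for structured feedback; the loop structure above is the part the abstract actually specifies.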
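The sub-task decomposition for sequential reinforcement learning can likewise be sketched as a staged curriculum. This is only an assumed control-flow skeleton: `SubTask`, `train_step`, and `success_rate` are hypothetical placeholders for an RL update and an evaluation rollout, not GenDexHand's API.

```python
import random
from typing import Callable, List

class SubTask:
    """Bundles a hypothetical per-step RL update with an evaluator
    returning the current success rate in [0, 1]."""
    def __init__(self, name: str,
                 train_step: Callable[[], None],
                 success_rate: Callable[[], float]):
        self.name = name
        self.train_step = train_step
        self.success_rate = success_rate

def train_sequentially(subtasks: List[SubTask],
                       threshold: float = 0.8,
                       max_steps: int = 1000) -> List[str]:
    """Train sub-task policies in order: advance to the next sub-task only
    once the current one clears the success threshold, so later stages
    start from states the earlier policies can reliably reach."""
    solved = []
    for task in subtasks:
        for _ in range(max_steps):
            task.train_step()
            if task.success_rate() >= threshold:
                solved.append(task.name)
                break
        else:
            break  # stop the curriculum if a stage never converges
    return solved

if __name__ == "__main__":
    # Toy stand-in for training: each step nudges success rate upward.
    progress = {"grasp": 0.0, "lift": 0.0}
    def make(name: str) -> SubTask:
        def step():
            progress[name] = min(1.0, progress[name] + random.uniform(0, 0.05))
        return SubTask(name, step, lambda: progress[name])
    print(train_sequentially([make("grasp"), make("lift")]))
```

Gating each stage on a success threshold is one simple way to realize the abstract's claim that decomposition reduces training time: each policy only ever explores from a narrow, reachable set of start states.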