Generative modeling has recently shown remarkable promise for visuomotor policy learning, enabling flexible and expressive control across diverse embodied AI tasks. However, existing generative policies often suffer from data inefficiency, as they require large-scale demonstrations, and from sampling inefficiency, as action generation at inference time is slow. We introduce EfficientFlow, a unified framework for efficient embodied AI with flow-based policy learning. To enhance data efficiency, we bring equivariance into flow matching. We theoretically prove that when using an isotropic Gaussian prior and an equivariant velocity prediction network, the resulting action distribution remains equivariant, leading to improved generalization and substantially reduced data demands. To accelerate sampling, we propose a novel acceleration regularization strategy. As direct computation of acceleration is intractable for marginal flow trajectories, we derive a surrogate loss that enables stable and scalable training using only conditional trajectories. Across a wide range of robotic manipulation benchmarks, the proposed algorithm achieves competitive or superior performance under limited data while offering dramatically faster inference. These results highlight EfficientFlow as a powerful and efficient paradigm for high-performance embodied AI.
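To make the training setup concrete, below is a minimal toy sketch of standard conditional flow matching: actions are coupled to samples from an isotropic Gaussian prior along straight-line conditional paths, and a velocity model is regressed onto the conditional target velocity. This is only an illustration of the generic flow-matching objective the abstract builds on, not EfficientFlow's method; the equivariant network, the acceleration regularizer, and names such as `fm_loss_and_grad` are simplifications or hypothetical.

```python
# Toy conditional flow matching with a linear velocity model (numpy only).
# NOT EfficientFlow: no equivariant architecture, no acceleration
# regularization; a hand-derived gradient replaces autodiff for brevity.
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D "action" data drawn from a fixed Gaussian.
actions = rng.normal(loc=[2.0, -1.0], scale=0.3, size=(512, 2))

# Linear velocity model v(x, t) = W @ [x, t, 1]. Flow matching regresses
# the model onto the conditional velocity x1 - x0 along the straight-line
# path x_t = (1 - t) * x0 + t * x1.
W = np.zeros((2, 4))

def features(x_t, t):
    # Stack state, time, and a bias column: shape (N, 4).
    return np.concatenate([x_t, t, np.ones_like(t)], axis=1)

def fm_loss_and_grad(W, x1):
    x0 = rng.normal(size=x1.shape)          # isotropic Gaussian prior
    t = rng.uniform(size=(x1.shape[0], 1))  # random time in [0, 1]
    x_t = (1 - t) * x0 + t * x1             # point on conditional path
    target_v = x1 - x0                      # conditional target velocity
    phi = features(x_t, t)
    err = phi @ W.T - target_v
    loss = np.mean(np.sum(err ** 2, axis=1))
    grad = 2.0 * err.T @ phi / x1.shape[0]  # gradient of the mean loss
    return loss, grad

losses = []
for _ in range(300):
    loss, grad = fm_loss_and_grad(W, actions)
    W -= 0.1 * grad
    losses.append(loss)

print(f"flow-matching loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

A residual loss remains even at the optimum, since the prior sample x0 is not fully recoverable from (x_t, t); the model learns the marginal expected velocity, which is exactly the quantity integrated at inference to generate actions.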