Equivariant neural networks for policy learning have recently shown promising improvements in sample efficiency and generalization; however, their wide adoption faces substantial barriers due to implementation complexity. Equivariant architectures typically require specialized mathematical formulations and custom network design, which makes them difficult to integrate with modern policy frameworks such as diffusion-based models. In this paper, we explore several straightforward and practical approaches to incorporating the benefits of symmetry into diffusion policies without the overhead of a fully equivariant design. Specifically, we investigate (i) invariant representations via relative trajectory actions and eye-in-hand perception, (ii) integrating equivariant vision encoders, and (iii) symmetric feature extraction with pretrained encoders using Frame Averaging. We first prove that combining eye-in-hand perception with a relative or delta action parameterization yields inherent SE(3)-invariance, thus improving policy generalization. We then perform a systematic experimental study of these design choices for integrating symmetry into diffusion policies, and conclude that an invariant representation combined with equivariant feature extraction significantly improves policy performance. Our method achieves performance on par with or exceeding fully equivariant architectures while greatly simplifying implementation.
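
The invariance claim for relative actions can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes the relative action is the next gripper pose expressed in the current gripper frame, i.e. `inverse(current) @ target`, and the poses, the transform `g`, and all helper names are hypothetical. Under that assumption, applying the same global SE(3) transform to both poses leaves the relative action unchanged, while the absolute target pose changes.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def pose(yaw, pitch, translation):
    """Build a 4x4 homogeneous transform with rotation Rz(yaw) @ Ry(pitch)."""
    cz, sz = math.cos(yaw), math.sin(yaw)
    cy, sy = math.cos(pitch), math.sin(pitch)
    rot = [
        [cz * cy, -sz, cz * sy],
        [sz * cy, cz, sz * sy],
        [-sy, 0.0, cy],
    ]
    tx, ty, tz = translation
    return [rot[0] + [tx], rot[1] + [ty], rot[2] + [tz], [0.0, 0.0, 0.0, 1.0]]

def inverse(t):
    """Invert a rigid transform: the inverse of [R | p] is [R^T | -R^T p]."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    new_p = [-sum(rt[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [rt[0] + [new_p[0]], rt[1] + [new_p[1]], rt[2] + [new_p[2]],
            [0.0, 0.0, 0.0, 1.0]]

def relative_action(current, target):
    """Target pose expressed in the current gripper frame (assumed convention)."""
    return matmul(inverse(current), target)

def max_abs_diff(a, b):
    """Largest elementwise absolute difference between two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

# Hypothetical gripper poses in the world frame at steps t and t+1.
pose_t = pose(0.3, -0.2, (0.5, 0.1, 0.4))
pose_next = pose(0.5, 0.1, (0.55, 0.15, 0.35))

# An arbitrary global SE(3) transform applied to the whole scene.
g = pose(1.2, 0.7, (-2.0, 3.0, 0.5))

rel_original = relative_action(pose_t, pose_next)
rel_transformed = relative_action(matmul(g, pose_t), matmul(g, pose_next))
abs_shift = max_abs_diff(pose_next, matmul(g, pose_next))

print(max_abs_diff(rel_original, rel_transformed))  # near zero (float error only)
print(abs_shift)  # clearly nonzero: the absolute action is not invariant
```

The cancellation follows from `inverse(g @ A) @ (g @ B) = inverse(A) @ inverse(g) @ g @ B = inverse(A) @ B`. The sketch covers only the action side; the abstract's claim also relies on eye-in-hand perception, so that the observation is likewise expressed in the gripper frame.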


