Accurate localization is essential for autonomous vehicles, yet sensor noise and drift over time can lead to significant pose estimation errors, particularly over long trajectories. A common strategy for correcting accumulated error is visual loop closure in SLAM, which adjusts the pose graph when the agent revisits previously mapped locations. These techniques typically rely on identifying visual correspondences between the current view and previously observed scenes, and often require fusing data from multiple sensors. In contrast, this work introduces NeRF-Assisted 3D-3D Pose Alignment (NAP3D), a complementary approach that leverages 3D-3D correspondences between the agent's current depth image and a pre-trained Neural Radiance Field (NeRF). By directly aligning 3D points from the observed scene with points synthesized from the NeRF, NAP3D refines the estimated pose even from novel viewpoints, without relying on revisiting previously observed locations. This robust 3D-3D formulation provides advantages over conventional 2D-3D localization methods while remaining comparable in accuracy and applicability. Experiments demonstrate that NAP3D achieves camera pose correction to within 5 cm on a custom dataset, robustly outperforming a 2D-3D Perspective-n-Point (PnP) baseline. On TUM RGB-D, NAP3D consistently improves 3D alignment RMSE by approximately 6 cm over this baseline under varying noise, even though PnP achieves lower raw rotation and translation parameter error in some regimes, highlighting NAP3D's improved geometric consistency in 3D space. By providing a lightweight, dataset-agnostic tool, NAP3D complements existing SLAM and localization pipelines when traditional loop closure is unavailable.
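The core step the abstract describes, aligning observed depth points with points synthesized from the NeRF, reduces to estimating a rigid transform between two matched 3D point sets. The abstract does not specify the solver, but the standard closed-form choice for this 3D-3D registration problem is the SVD-based Kabsch method, sketched below under that assumption (the function name `rigid_align` and the use of NumPy are illustrative, not part of NAP3D itself):

```python
import numpy as np

def rigid_align(src, dst):
    """Closed-form (Kabsch) estimate of R, t minimizing ||R @ src_i + t - dst_i||^2.

    src: (N, 3) observed 3D points (e.g. back-projected from a depth image).
    dst: (N, 3) corresponding 3D points (e.g. rendered from a pre-trained NeRF).
    Returns a rotation matrix R (3, 3) and translation t (3,).
    """
    # Center both point clouds so only rotation remains to be solved.
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered correspondences.
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    # Sign correction guards against a reflection (det = -1) solution.
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

In a pipeline like the one described, the recovered `R, t` would be composed with the current pose estimate as a correction; in practice the correspondences would come from matching the depth image against NeRF-rendered geometry rather than being given exactly.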