In many robotics and VR/AR applications, fast camera motions lead to a high level of motion blur, causing existing camera pose estimation methods to fail. In this work, we propose a novel framework that leverages motion blur as a rich cue for motion estimation rather than treating it as an unwanted artifact. Our approach works by predicting a dense motion flow field and a monocular depth map directly from a single motion-blurred image. We then recover the instantaneous camera velocity by solving a linear least squares problem under the small motion assumption. In essence, our method produces an IMU-like measurement that robustly captures fast and aggressive camera movements. To train our model, we construct a large-scale dataset with realistic synthetic motion blur derived from ScanNet++v2 and further refine our model by training end-to-end on real data using our fully differentiable pipeline. Extensive evaluations on real-world benchmarks demonstrate that our method achieves state-of-the-art angular and translational velocity estimates, outperforming current methods like MASt3R and COLMAP.
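The velocity-recovery step described above can be sketched with the classic differential motion-field model: under the small motion assumption, per-pixel flow is linear in the translational velocity v and angular velocity ω given depth Z, so stacking one 2×6 block per pixel yields a linear least squares problem. The paper's exact parameterization is not given here, so the design matrix, noise level, and variable names below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth instantaneous camera velocity (illustrative values).
v_true = np.array([0.2, -0.1, 0.05])   # translational velocity
w_true = np.array([0.01, 0.3, -0.2])   # angular velocity

# Sample normalized image coordinates and per-pixel depths
# (standing in for the predicted monocular depth map).
n = 500
x = rng.uniform(-0.5, 0.5, n)
y = rng.uniform(-0.5, 0.5, n)
Z = rng.uniform(1.0, 5.0, n)

def motion_field_matrix(x, y, Z):
    """Stacked 2n x 6 design matrix of the differential motion-field model:
    flow = A @ [v; w] for a camera with velocity (v, w) and scene depth Z."""
    zeros = np.zeros_like(x)
    A = np.empty((2 * len(x), 6))
    A[0::2] = np.stack([-1 / Z, zeros, x / Z, x * y, -(1 + x**2),  y], axis=1)
    A[1::2] = np.stack([zeros, -1 / Z, y / Z, 1 + y**2, -x * y,   -x], axis=1)
    return A

A = motion_field_matrix(x, y, Z)
flow = A @ np.concatenate([v_true, w_true])   # synthetic dense flow field
flow += rng.normal(0.0, 1e-4, flow.shape)     # small measurement noise

# Recover [v; w] in closed form by linear least squares.
sol, *_ = np.linalg.lstsq(A, flow, rcond=None)
v_est, w_est = sol[:3], sol[3:]
```

Because every pixel contributes two linear constraints, even a modest number of flow/depth samples heavily overdetermines the six velocity unknowns, which is what makes the IMU-like estimate robust to noise in the predicted flow.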