Retrieving the 3D pose and shape of objects from images is an ill-posed problem. A common approach to object reconstruction is to match entities such as keypoints, edges, or contours of a deformable 3D model, used as a shape prior, to the corresponding entities inferred from the image. However, such approaches are highly sensitive to model initialisation, imprecise keypoint localisation, and/or illumination conditions. In this paper, we present a probabilistic approach for shape-aware 3D vehicle reconstruction from stereo images that leverages the outputs of a novel multi-task CNN. Specifically, we train a CNN that outputs probability distributions for the vehicle's orientation and for both vehicle keypoints and wireframe edges. We integrate the predicted distributions, together with 3D stereo information, into a common probabilistic framework. We believe that the CNN-based detection of wireframe edges reduces the sensitivity to illumination conditions and object contrast, and that using the raw probability maps instead of inferring keypoint positions reduces the sensitivity to keypoint localisation errors. We evaluate our method on the challenging KITTI benchmark and on our own new 'Stereo-Vehicle' dataset and show that it achieves state-of-the-art results.
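The idea of scoring a pose hypothesis directly against raw keypoint probability maps, rather than against a single inferred keypoint position, can be illustrated with a minimal sketch. This is not the authors' implementation: the pinhole camera model, the nearest-pixel lookup, and all names (`project`, `gaussian_map`, `heatmap_log_likelihood`) are assumptions made for illustration only.

```python
import numpy as np


def project(points_3d, focal, cx, cy):
    """Pinhole projection of (N, 3) camera-frame points to (N, 2) pixels (u, v)."""
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    return np.stack([focal * x / z + cx, focal * y / z + cy], axis=1)


def gaussian_map(height, width, cu, cv, sigma):
    """Synthetic stand-in for one CNN keypoint map, normalised to sum to 1."""
    v, u = np.mgrid[0:height, 0:width]
    g = np.exp(-((u - cu) ** 2 + (v - cv) ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()


def heatmap_log_likelihood(heatmaps, pixels, eps=1e-9):
    """Sum of log-probabilities read from the maps at the projected keypoints.

    heatmaps: (K, H, W), one probability map per keypoint (hypothetical layout).
    pixels:   (K, 2), projected (u, v) position of each model keypoint.
    """
    k, h, w = heatmaps.shape
    # Nearest-pixel lookup, clipped to the image; an assumption of this sketch.
    u = np.clip(np.rint(pixels[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.rint(pixels[:, 1]).astype(int), 0, h - 1)
    probs = heatmaps[np.arange(k), v, u]
    return float(np.sum(np.log(probs + eps)))


# One keypoint whose map peaks at pixel (u=30, v=20).
heatmaps = gaussian_map(48, 64, 30.0, 20.0, 3.0)[None]

# Two hypothetical 3D positions of the same model keypoint.
near_peak = project(np.array([[-0.2, -0.4, 10.0]]), 100.0, 32.0, 24.0)
off_peak = project(np.array([[0.5, -0.4, 10.0]]), 100.0, 32.0, 24.0)

score_near = heatmap_log_likelihood(heatmaps, near_peak)
score_off = heatmap_log_likelihood(heatmaps, off_peak)
```

In this toy setting the hypothesis that projects onto the peak of the map receives the higher score, and a hypothesis a few pixels away is penalised gradually rather than being compared against one hard keypoint estimate. A full system would combine such a term with the orientation, wireframe-edge, and stereo terms mentioned in the abstract, which this sketch does not attempt.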