Paper Title

Learning 2D-3D Correspondences To Solve The Blind Perspective-n-Point Problem

Authors

Liu Liu, Dylan Campbell, Hongdong Li, Dingfu Zhou, Xibin Song, Ruigang Yang

Abstract

Conventional absolute camera pose estimation via a Perspective-n-Point (PnP) solver often assumes that the correspondences between 2D image pixels and 3D points are given. When the correspondences between 2D and 3D points are not known a priori, the task becomes the much more challenging blind PnP problem. This paper proposes a deep CNN model that simultaneously solves for both the 6-DoF absolute camera pose and the 2D-3D correspondences. Our model comprises three neural modules connected in sequence. First, a two-stream PointNet-inspired network is applied directly to both the 2D image keypoints and the 3D scene points to extract discriminative point-wise features that harness both local and contextual information. Second, a global feature matching module estimates a matchability matrix over all 2D-3D pairs. Third, the obtained matchability matrix is fed into a classification module to disambiguate inlier matches. The entire network is trained end-to-end, followed by robust model fitting (P3P-RANSAC) at test time only to recover the 6-DoF camera pose. Extensive tests on both real and simulated data show that our method substantially outperforms existing approaches and can process thousands of points per second with state-of-the-art accuracy.
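To make the matching stages of the pipeline concrete, here is a minimal numpy sketch of the core idea: given point-wise features for the 2D keypoints and 3D scene points (stage one's output), build a row-normalized matchability matrix over all 2D-3D pairs (stage two) and keep confident pairs (a stand-in for stage three's inlier classification). The function names, the dot-product similarity, the softmax normalization, and the confidence threshold are all illustrative assumptions, not the paper's actual network.

```python
import numpy as np

def matchability_matrix(feat_2d, feat_3d, temperature=0.1):
    """Pairwise similarity between N 2D-keypoint features (N x D) and
    M 3D-point features (M x D), softmax-normalized per row so each 2D
    point gets a soft assignment over the 3D candidates.
    (Illustrative; the paper's matching module is learned.)"""
    sim = feat_2d @ feat_3d.T / temperature      # (N, M) similarity logits
    sim -= sim.max(axis=1, keepdims=True)        # numerical stability
    p = np.exp(sim)
    return p / p.sum(axis=1, keepdims=True)

def select_inlier_matches(match, threshold=0.5):
    """Keep (2D index, 3D index) pairs whose matchability exceeds a
    confidence threshold -- a crude stand-in for the classification
    module that disambiguates inlier matches."""
    best = match.argmax(axis=1)
    conf = match[np.arange(match.shape[0]), best]
    return [(i, int(j)) for i, (j, c) in enumerate(zip(best, conf))
            if c > threshold]
```

The selected (2D, 3D) pairs would then be handed to a robust solver such as P3P-RANSAC to recover the 6-DoF pose.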
