Visual Odometry

Visual Odometry (VO) is the task of estimating camera motion from consecutive camera frames. It incrementally estimates how a robot or vehicle has moved, using visual input alone.

Differences from SLAM

VO focuses mainly on odometry, that is, motion estimation between frames. SLAM, by contrast, covers map construction and loop closure in addition to camera pose estimation.

Item	Visual Odometry	SLAM
Main goal	Incremental camera motion estimation	Localization and map construction
Loop closure	Usually not included	A key component
Drift correction	Limited	Corrected by loop closure and global optimization
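
The drift row in the table can be illustrated with a small sketch. It chains frame-to-frame motions into global poses, as VO does, and compares a perfect estimate with one that carries a small per-frame heading bias. The planar SE(2) model, the helper names `se2` and `integrate`, and the bias value are assumptions made for illustration and are not taken from any particular VO system.

```python
import numpy as np

def se2(dx, dy, dtheta):
    """Homogeneous 3x3 matrix of a planar rigid motion (hypothetical helper)."""
    c, s = np.cos(dtheta), np.sin(dtheta)
    return np.array([[c, -s, dx],
                     [s,  c, dy],
                     [0.0, 0.0, 1.0]])

def integrate(relative_motions):
    """Chain frame-to-frame motions into a list of global poses."""
    pose = np.eye(3)
    trajectory = [pose]
    for motion in relative_motions:
        pose = pose @ motion  # each motion is expressed in the previous frame
        trajectory.append(pose)
    return trajectory

n_steps = 200
true_step = se2(0.1, 0.0, 0.0)        # straight line, 0.1 units per frame
biased_step = se2(0.1, 0.0, 0.002)    # assumed heading bias of 0.002 rad per frame

truth = integrate([true_step] * n_steps)
estimate = integrate([biased_step] * n_steps)

# Position error of the estimate at every frame.
errors = [float(np.linalg.norm(e[:2, 2] - t[:2, 2]))
          for e, t in zip(estimate, truth)]

for k in (10, 100, 200):
    print(f"frame {k:3d}: position error = {errors[k]:.3f}")
```

Because each pose is built on the previous one, the error never resets. This is why a pipeline without loop closure or global optimization can only limit drift, not remove it.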

Feature-based VO

Feature-based VO matches features between frames and estimates motion using epipolar geometry, PnP, triangulation, and related techniques.

frame t, frame t+1
  -> feature matching
  -> essential matrix / PnP
  -> pose estimation
  -> local optimization
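
The essential-matrix step of this pipeline can be sketched with NumPy alone. The example below applies the linear eight-point method to noise-free synthetic matches in normalized image coordinates. The scene, the motion values, and the helper names `skew` and `estimate_essential` are assumptions for illustration. A practical pipeline would wrap this step in an outlier-rejection scheme such as RANSAC and then decompose the result into rotation and translation.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ a equals np.cross(v, a)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def estimate_essential(x1, x2):
    """Linear eight-point estimate of E from normalized points (N x 2 each)."""
    u1, v1 = x1[:, 0], x1[:, 1]
    u2, v2 = x2[:, 0], x2[:, 1]
    # Each match gives one row of A e = 0, derived from x2_h^T E x1_h = 0,
    # where e is E flattened in row-major order.
    A = np.column_stack([u2 * u1, u2 * v1, u2,
                         v2 * u1, v2 * v1, v2,
                         u1, v1, np.ones(len(u1))])
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)
    # Enforce the essential-matrix structure: singular values (1, 1, 0).
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

# Synthetic scene (assumed values): 50 points in front of both cameras.
rng = np.random.default_rng(0)
points_cam1 = np.column_stack([rng.uniform(-1.0, 1.0, 50),
                               rng.uniform(-1.0, 1.0, 50),
                               rng.uniform(4.0, 8.0, 50)])
angle = 0.1  # rotation about the y axis, in radians
R_true = np.array([[np.cos(angle), 0.0, np.sin(angle)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(angle), 0.0, np.cos(angle)]])
t_true = np.array([0.5, 0.0, 0.1])
points_cam2 = points_cam1 @ R_true.T + t_true  # X2 = R X1 + t

# Perspective projection to normalized image coordinates.
x1 = points_cam1[:, :2] / points_cam1[:, 2:]
x2 = points_cam2[:, :2] / points_cam2[:, 2:]

E_est = estimate_essential(x1, x2)

# Epipolar residual x2_h^T E x1_h for every match; near zero without noise.
x1_h = np.column_stack([x1, np.ones(len(x1))])
x2_h = np.column_stack([x2, np.ones(len(x2))])
residuals = np.einsum("ni,ij,nj->n", x2_h, E_est, x1_h)
print("max |epipolar residual|:", np.max(np.abs(residuals)))
```

The estimate is defined only up to scale and sign, which reflects the scale ambiguity of monocular VO: the translation direction can be recovered, but not its magnitude.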

Direct VO

Direct VO does not use feature descriptors. Instead, it directly minimizes the photometric error of pixel intensities.

$$\sum_{\mathbf{x}} \|I_t(\mathbf{x}) - I_{t+1}(w(\mathbf{x}; \mathbf{T}))\|^2$$

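Here, $w$ is the warp induced by the pose $\mathbf{T}$. Direct methods can use every pixel in textured regions rather than only a sparse set of keypoints, but they are sensitive to illumination changes because the error term assumes that a scene point keeps the same intensity across frames.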
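
A heavily simplified sketch of this objective follows. It replaces the pose-induced warp with an integer 2D image translation and finds the minimum by exhaustive search, so it shows only the shape of the photometric error, not a real direct method. Actual direct VO systems warp pixels through depth and a 6-DoF pose and minimize the error iteratively. The function name `photometric_error`, the random test image, and the search range are assumptions for illustration. The error is averaged over the overlapping pixels instead of summed, so that shifts with different overlap sizes remain comparable.

```python
import numpy as np

def photometric_error(img_t, img_t1, shift):
    """Mean squared intensity difference under the warp w(x) = x + shift.

    Only pixels whose warped position falls inside img_t1 are compared.
    """
    dy, dx = shift
    h, w = img_t.shape
    y0, y1 = max(0, -dy), min(h, h - dy)
    x0, x1 = max(0, -dx), min(w, w - dx)
    ref = img_t[y0:y1, x0:x1]
    warped = img_t1[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
    return float(np.mean((ref - warped) ** 2))

# Synthetic frames (assumed): frame t+1 is frame t moved by (2, -3) pixels.
rng = np.random.default_rng(1)
img_t = rng.random((40, 40))
true_shift = (2, -3)
img_t1 = np.roll(img_t, shift=true_shift, axis=(0, 1))

# Exhaustive search over candidate warps in a small window.
candidates = [(dy, dx) for dy in range(-5, 6) for dx in range(-5, 6)]
costs = {s: photometric_error(img_t, img_t1, s) for s in candidates}
best_shift = min(costs, key=costs.get)
print("estimated shift:", best_shift)
```

Adding a constant offset to `img_t1` raises the error even at the correct shift, which is a simple way to see the sensitivity to illumination change.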

RGB-D / LiDAR / Visual-Inertial Odometry

Odometry is not limited to a single passive camera. An RGB-D camera provides depth directly, and LiDAR enables 3D scan matching. Visual-Inertial Odometry (VIO), which adds an IMU, gains stronger short-term motion constraints and becomes more robust to rapid motion and motion blur.
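
When depth is available, as with RGB-D or LiDAR, the motion between two frames can be estimated from 3D point pairs. The sketch below uses the closed-form SVD solution for rigid alignment (the Kabsch method) and assumes that point correspondences are already known. Scan-matching methods such as ICP alternate between estimating correspondences and a step of this kind. The function name `rigid_align` and the synthetic data are assumptions for illustration.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares R, t such that dst is approximately src @ R.T + t."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection: force det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Synthetic 3D points seen in two frames (assumed values).
rng = np.random.default_rng(2)
src = rng.uniform(-2.0, 2.0, size=(100, 3))
angle = 0.2  # rotation about the z axis, in radians
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.3, -0.1, 0.05])
dst = src @ R_true.T + t_true

R_est, t_est = rigid_align(src, dst)
print("rotation error:", np.linalg.norm(R_est - R_true))
print("translation error:", np.linalg.norm(t_est - t_true))
```

Unlike the monocular case, the translation recovered here has metric scale, because the 3D points themselves are measured in metric units.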

Main sources

Scaramuzza and Fraundorfer, “Visual Odometry Tutorial”, IEEE Robotics & Automation Magazine: https://rpg.ifi.uzh.ch/docs/VO_Part_I_Scaramuzza.pdf
Engel et al., “Direct Sparse Odometry”, 2016: https://arxiv.org/abs/1607.02565
Forster et al., “SVO: Fast Semi-Direct Monocular Visual Odometry”, 2014: https://rpg.ifi.uzh.ch/docs/ICRA14_Forster.pdf
