Technology Sharing

Visual SLAM and positioning: front-end feature points and matching

2024-07-12


Reference articles or links

Image Matching from Handcrafted to Deep Features: A Survey
Image Matching across Wide Baselines: From Paper to Practice
Image Registration Techniques: A Survey
Map-based visual positioning

Evaluation of feature point performance

Commonly used metrics:

Repeatability
Average corner detection error
Corner positioning error

In short, a good feature extractor stably and accurately detects the 2D points formed by projecting the same 3D spatial points into the image, even as conditions such as viewpoint and lighting change.
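As a concrete illustration of the repeatability metric, here is a minimal sketch in Python with OpenCV and NumPy. It assumes a planar scene (or pure rotation) so that a known homography H relates the two views; the FAST detector and the 3-pixel tolerance are illustrative choices, not part of the metric's definition.

```python
import cv2
import numpy as np

def repeatability(img1, img2, H, eps=3.0):
    """Fraction of keypoints from img1 that reappear in img2.

    H is the ground-truth 3x3 homography mapping img1 pixels to img2
    (valid for planar scenes or pure rotation). eps is the pixel
    tolerance; 3 px is a common but arbitrary choice.
    """
    fast = cv2.FastFeatureDetector_create()
    kp1 = fast.detect(img1, None)
    kp2 = fast.detect(img2, None)
    if not kp1 or not kp2:
        return 0.0

    pts1 = np.float32([k.pt for k in kp1]).reshape(-1, 1, 2)
    pts2 = np.float32([k.pt for k in kp2])

    # Warp img1 keypoints into img2's frame with the known homography.
    proj = cv2.perspectiveTransform(pts1, H).reshape(-1, 2)

    # Keep only projections that fall inside img2.
    h, w = img2.shape[:2]
    inside = (proj[:, 0] >= 0) & (proj[:, 0] < w) & \
             (proj[:, 1] >= 0) & (proj[:, 1] < h)
    proj = proj[inside]
    if len(proj) == 0:
        return 0.0

    # A projected point counts as "repeated" if some detection in img2
    # lies within eps pixels of it.
    d = np.linalg.norm(proj[:, None, :] - pts2[None, :, :], axis=2)
    repeated = (d.min(axis=1) < eps).sum()
    return repeated / len(proj)
```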

Traditional feature points and descriptors (feature points only or feature points + descriptors)

Visual feature points were originally hand-crafted from researchers' intuition, such as SIFT, SURF, ORB, and Harris, and were adopted by the visual SLAM and positioning systems of the same period (a minimal OpenCV sketch follows the list below).
[Harris, 1988]: VINS
[Shi-Tomasi, 1994]: MonoSLAM
[FAST, 2006]: ORB_SLAM, T265 VIO, MSCKF-VIO, OpenVSLAM, OKVIS, ROVIO, PTAM
[Blob and Corner]: SOFT-SLAM
[SIFT, 1999]: MSCKF
[FREAK, 2012]: Vision-Aided Localization for Ground Robots
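All three classical corner detectors above ship with OpenCV; a minimal sketch (the image path is a placeholder):

```python
import cv2
import numpy as np

img = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)  # placeholder path

# Harris corner response (1988): threshold the response map yourself.
harris = cv2.cornerHarris(np.float32(img), blockSize=2, ksize=3, k=0.04)
harris_pts = np.argwhere(harris > 0.01 * harris.max())

# Shi-Tomasi "good features to track" (1994), the detector typically
# fed to KLT optical flow in MonoSLAM/VINS-style front ends.
shi_tomasi = cv2.goodFeaturesToTrack(img, maxCorners=500,
                                     qualityLevel=0.01, minDistance=10)

# FAST (2006): segment-test corners, the detector behind ORB_SLAM's ORB.
fast = cv2.FastFeatureDetector_create(threshold=20)
fast_kps = fast.detect(img, None)

print(len(harris_pts), len(shi_tomasi), len(fast_kps))
```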

Traditional Descriptor

[BRIEF, 2010]: Often used together with FAST corners, e.g. in ORB_SLAM, LDSO, etc.
[BRISK, 2011]: An improvement on BRIEF with scale and rotation invariance (see the sketch below)
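A minimal sketch of extracting and matching binary descriptors with OpenCV; ORB stands in for the FAST + BRIEF combination (the stand-alone BRIEF extractor lives in opencv-contrib), and image paths are placeholders:

```python
import cv2

img1 = cv2.imread("a.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
img2 = cv2.imread("b.png", cv2.IMREAD_GRAYSCALE)

# ORB = oriented FAST + rotation-aware BRIEF, the ORB_SLAM combination.
orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# BRISK: scale- and rotation-invariant binary descriptor.
brisk = cv2.BRISK_create()
kp1b, des1b = brisk.detectAndCompute(img1, None)
print(des1.shape, des1b.shape)  # descriptor sizes differ per method

# Binary descriptors are compared with Hamming distance.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} cross-checked ORB matches")
```

Because BRIEF, BRISK, and ORB descriptors are bit strings, they are compared with the Hamming distance rather than the L2 norm, which is why the matcher above is created with cv2.NORM_HAMMING.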

Limitations of traditional descriptors: they are hand-designed from human intuition rather than optimized from data, so they struggle with strong changes in lighting and viewing angle.

Feature points based on deep learning

CovDet
Quad-networks
AffNet
KeyNet
MagicPoint

Descriptors based on deep learning

L2-Net: New sampling strategy and loss, CVPR 2017
DeepCD: Floating-point descriptors that complement binary descriptors, ICCV 2017
Spread-out: Learning the spatial distribution of descriptors, ICCV 2017
HardNet: Improved loss built on L2-Net (see the sketch after this list), NIPS 2017
SOSNet: Descriptor learning with second-order similarity regularization, CVPR 2019
GIFT: Scale- and rotation-invariant descriptors via group convolutions, NeurIPS 2019
S2DNet: Casts descriptor learning as classification, trained in a sparse-to-dense manner, ECCV 2020
CAPS: Descriptor learning supervised only by epipolar constraints, ECCV 2020
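To make the loss-design theme of this list concrete, below is a sketch of HardNet's hardest-in-batch triplet margin loss in PyTorch. It is simplified from the paper; the margin value and the masking constant are conventional choices of mine, not taken from the original post.

```python
import torch

def hardnet_loss(anchors, positives, margin=1.0):
    """Hardest-in-batch triplet margin loss (HardNet, NIPS 2017), sketch.

    anchors, positives: (B, D) L2-normalized descriptors, where row i
    of each tensor comes from the same physical patch.
    """
    # Pairwise Euclidean distances between anchors and positives.
    d = torch.cdist(anchors, positives)          # (B, B)
    pos = d.diag()                               # distances of true pairs

    # Exclude the positive pair itself when mining negatives.
    B = d.size(0)
    d_masked = d + torch.eye(B, device=d.device) * 1e6

    # Hardest negative per pair: the closest non-matching descriptor,
    # searched over both rows and columns as in the paper.
    neg = torch.min(d_masked.min(dim=1).values,
                    d_masked.min(dim=0).values)

    return torch.clamp(margin + pos - neg, min=0).mean()
```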

Feature points + descriptors based on deep learning

SuperPoint: Self-supervised feature point and descriptor learning with some robustness to lighting; used in DX-SLAM, CVPRW 2018
LIFT: Learned Invariant Feature Transform, ECCV 2016
DISK: Learns detection and description with reinforcement-learning policy gradients; fairly robust in weakly textured regions, NeurIPS 2020
R2D2: Jointly learns the repeatability and reliability of feature points, NeurIPS 2019
D2-Net: A trainable CNN that jointly detects and describes local features, CVPR 2019
ASLFeat: Local feature learning for accurate shape and localization, CVPR 2020

Feature points and descriptors based on deep learning are usually proposed to address the failure cases of traditional methods in practical applications, and are more robust to lighting, viewpoint changes, and the like (a sketch using a pretrained model follows).
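Several of these learned front ends are now available through off-the-shelf libraries. The sketch below assumes kornia (roughly version 0.7 or later) and its pretrained DISK wrapper as described in the kornia documentation; treat the exact API as an assumption and check against your installed version.

```python
import torch
import kornia.feature as KF

# Assumption: kornia ships a pretrained DISK model as documented.
disk = KF.DISK.from_pretrained("depth").eval()

# DISK expects a float image batch in [0, 1], shape (B, 3, H, W).
img = torch.rand(1, 3, 480, 640)  # stand-in for a real image tensor

with torch.no_grad():
    # pad_if_not_divisible handles sizes not divisible by 16.
    feats = disk(img, n=2048, pad_if_not_divisible=True)[0]

print(feats.keypoints.shape, feats.descriptors.shape)  # (N, 2), (N, 128)
```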

Feature Matching

Nearest-neighbor (kNN) matching
FLANN: fast approximate nearest-neighbor matching (see the sketch after this list)
GMS: Fast and robust feature matching using motion smoothness, CVPR 2017
AdaLAM: An outlier rejection algorithm that considers both the spatial distribution of correspondences and local affine consistency, running RANSAC-based affine verification within image regions
SGM-Nets: Semi-global matching with neural networks, CVPR 2017
PointCN: After brute-force matching, a multi-layer perceptron classifies and rejects wrong matches, CVPR 2018
SuperGlue: Robust matching based on graph neural networks and an attention mechanism, CVPR 2020
LoFTR: Detector-free local feature matching with Transformers, CVPR 2021
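As a minimal sketch of the first two entries, here is kNN matching plus Lowe's ratio test on top of OpenCV's FLANN index; SIFT descriptors and the 0.7 ratio are conventional choices, not mandated by the post.

```python
import cv2

img1 = cv2.imread("a.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
img2 = cv2.imread("b.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# FLANN with KD-trees suits float descriptors like SIFT.
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5),  # 1 = KD-tree
                              dict(checks=50))
knn = flann.knnMatch(des1, des2, k=2)

# Lowe's ratio test: keep a match only if it is clearly better than
# the second-best candidate.
good = [p[0] for p in knn
        if len(p) == 2 and p[0].distance < 0.7 * p[1].distance]
print(f"{len(good)} matches survive the ratio test")
```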