
Visual odometry with depth-wise separable convolution and quaternion neural networks / De Magistris, G.; Comminiello, D.; Napoli, C.; Starczewski, J. T. - 3417:(2023), pp. 70-80. (Paper presented at the 9th Italian workshop on artificial intelligence and robotics, AIRO 2022, held in Udine, Italy).

Visual odometry with depth-wise separable convolution and quaternion neural networks

De Magistris G. (First; Investigation); Comminiello D. (Second; Resources); Napoli C. (Penultimate; Supervision); Starczewski J. T.
2023

Abstract

Monocular visual odometry is a fundamental problem in computer vision and has been extensively studied in the literature. The vast majority of visual odometry algorithms follow a standard pipeline consisting of feature detection, feature matching, motion estimation, and local optimization. Only recently have deep learning approaches shown cutting-edge performance, replacing the standard pipeline with an end-to-end solution. One of the main advantages of deep learning approaches over standard methods is the reduced inference time, an important requirement for real-time visual odometry. Less emphasis, however, has been placed on memory requirements and training efficiency. The memory footprint, in particular, matters for real-world applications such as robot navigation or autonomous driving, where devices have limited memory resources. In this paper we tackle both aspects by introducing novel architectures based on depth-wise separable convolutional neural networks and deep quaternion recurrent convolutional neural networks. In particular, we obtain accuracy equal to or better than other state-of-the-art methods on the KITTI VO dataset, with fewer parameters and faster inference.
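To make the parameter saving behind the first architecture concrete, here is a minimal PyTorch sketch of a depth-wise separable convolution, which factors a standard convolution into a per-channel spatial filter followed by a 1x1 channel-mixing filter. The channel sizes (64 in, 128 out) and the 3x3 kernel are illustrative assumptions, not the layers actually used in the paper.

import torch
import torch.nn as nn

# Standard 3x3 convolution: C_in * C_out * k * k weights.
standard = nn.Conv2d(64, 128, kernel_size=3, padding=1, bias=False)

# Depth-wise separable convolution: a per-channel (depth-wise) 3x3
# convolution followed by a 1x1 point-wise convolution that mixes channels.
separable = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=64, bias=False),
    nn.Conv2d(64, 128, kernel_size=1, bias=False),
)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print(n_params(standard))   # 64 * 128 * 3 * 3 = 73728
print(n_params(separable))  # 64 * 3 * 3 + 64 * 128 = 8768 (~8.4x fewer)

# Both layers map the same input shape to the same output shape.
x = torch.randn(1, 64, 32, 32)
assert standard(x).shape == separable(x).shape

The quaternion networks save parameters in a complementary way: the Hamilton product reuses four real weight blocks across a 4x4 block-structured weight matrix, so a quaternion layer has roughly a quarter of the parameters of a real-valued layer with the same dimensions. Below is a minimal linear-layer sketch of this construction (reusing n_params from above); it illustrates the general technique, not the paper's recurrent architecture.

class QuaternionLinear(nn.Module):
    """Linear layer whose weight realizes the quaternion Hamilton product.

    Four real blocks (r, i, j, k) are shared across a 4x4 block matrix,
    so the layer has in_features * out_features / 4 parameters instead of
    in_features * out_features.
    """
    def __init__(self, in_features, out_features):
        super().__init__()
        assert in_features % 4 == 0 and out_features % 4 == 0
        n_in, n_out = in_features // 4, out_features // 4
        self.r = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.i = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.j = nn.Parameter(torch.randn(n_out, n_in) * 0.1)
        self.k = nn.Parameter(torch.randn(n_out, n_in) * 0.1)

    def forward(self, x):
        # x holds the four quaternion components concatenated along the
        # feature axis: [x_r | x_i | x_j | x_k].
        r, i, j, k = self.r, self.i, self.j, self.k
        weight = torch.cat([
            torch.cat([r, -i, -j, -k], dim=1),
            torch.cat([i,  r, -k,  j], dim=1),
            torch.cat([j,  k,  r, -i], dim=1),
            torch.cat([k, -j,  i,  r], dim=1),
        ], dim=0)
        return x @ weight.t()

layer = QuaternionLinear(64, 64)
print(n_params(layer))  # 4 * 16 * 16 = 1024, vs 64 * 64 = 4096 for nn.Linear
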
2023
9th Italian workshop on artificial intelligence and robotics. AIRO 2022
computer vision; convolution; convolutional neural networks; motion estimation; recurrent neural networks; robots; vision; convolutional neural network; cutting edges; detection features; features detections; features matching; learning approach; local optimizations; neural-networks; odometry algorithms; visual odometry
04 Conference proceedings publication::04b Conference paper in volume
Files attached to this product

File: De Magistris_Visual Odometry_2023.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 3.52 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1686143
Citations
  • PMC: n/a
  • Scopus: 0
  • Web of Science: n/a