Research Products Catalogue

The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots / Cermelli, Fabio; Mancini, Massimiliano; Ricci, Elisa; Caputo, Barbara. - (2019), pp. 6097-6104. (Paper presented at the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), held in Macau, China) [10.1109/IROS40897.2019.8968562].

The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots

Fabio Cermelli; Massimiliano Mancini; Elisa Ricci; Barbara Caputo

2019

Abstract

Deep networks have brought significant advances in robot perception, improving the capabilities of robots in several visual tasks, ranging from object detection and recognition to pose estimation, semantic scene segmentation and many others. Still, most approaches address visual tasks in isolation, resulting in overspecialized models that achieve strong performance in specific applications but work poorly on other (often related) tasks. This is clearly sub-optimal for a robot, which is often required to perform multiple visual recognition tasks simultaneously in order to properly act in and interact with the environment. The problem is exacerbated by the limited computational and memory resources typically available onboard a robotic platform. Learning flexible models that can handle multiple tasks in a lightweight manner has recently gained attention in the computer vision community, and benchmarks supporting this research have been proposed. In this work we study this problem in the robot vision context, proposing a new benchmark, the RGB-D Triathlon, and evaluating state-of-the-art algorithms in this novel, challenging scenario. We also define a new evaluation protocol, better suited to the robot vision setting. The results shed light on the strengths and weaknesses of existing approaches and on open issues, suggesting directions for future research.

Full record

	Year of publication
	
				2019
			
	Conference name
	
				2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
			
	Keywords
	
				multi-task learning; visual learning; robot vision
			
	Type
	
				04 Publication in conference proceedings::04b Conference paper in a volume
			
	Citation
	
				The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots / Cermelli, Fabio; Mancini, Massimiliano; Ricci, Elisa; Caputo, Barbara. - (2019), pp. 6097-6104. (Paper presented at the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), held in Macau, China) [10.1109/IROS40897.2019.8968562].
			
	Belongs to type:
	
				04b Conference paper in a volume

Files attached to this product

File	Size	Format
Cermelli_The-RGB-D-Triathlon_2019.pdf. Access: archive managers only. Type: Publisher's version (published with the publisher's layout). License: All rights reserved.	2.02 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1334128
