Despite the impressive progress brought by deep networks in visual object recognition, robot vision is still far from being a solved problem. The most successful convolutional architectures are developed starting from ImageNet, a large-scale collection of images of object categories downloaded from the Web. Such images are very different from the situated and embodied visual experience of robots deployed in unconstrained settings. To reduce the gap between these two visual experiences, this paper proposes a simple yet effective data augmentation layer that zooms in on the object of interest and simulates the object detection outcome of a robot vision system. The layer, which can be used with any convolutional deep architecture, yields an increase in object recognition performance of up to 7% in experiments performed on three different benchmark databases. An implementation of our robot data augmentation layer has been made publicly available.
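The zoom-and-simulated-detection idea described in the abstract can be illustrated with a small sketch. This is not the authors' released layer. It assumes the object's bounding box is already known, and it imitates an imperfect detector by randomly shifting each crop edge. The function name `zoom_crop_box` and the parameters `margin` and `jitter` are hypothetical, chosen only for illustration.

```python
import random


def zoom_crop_box(box, image_size, margin=0.1, jitter=0.05, rng=None):
    """Return a crop window (left, top, right, bottom) around an object box.

    box        -- (left, top, right, bottom) of the object, in pixels
    image_size -- (width, height) of the full image, in pixels
    margin     -- hypothetical: fraction of the box size added on every side
    jitter     -- hypothetical: maximum random change of each edge offset,
                  as a fraction of the box size, imitating detector noise
    """
    rng = rng or random.Random()
    left, top, right, bottom = box
    width, height = image_size
    if not (left < right and top < bottom):
        raise ValueError("box must have positive width and height")

    box_w, box_h = right - left, bottom - top

    def offset(extent):
        # Each edge moves outwards by the margin plus a random perturbation.
        return (margin + rng.uniform(-jitter, jitter)) * extent

    new_left = left - offset(box_w)
    new_top = top - offset(box_h)
    new_right = right + offset(box_w)
    new_bottom = bottom + offset(box_h)

    # Clip the window to the image and round to whole pixels.
    return (
        max(0, int(round(new_left))),
        max(0, int(round(new_top))),
        min(width, int(round(new_right))),
        min(height, int(round(new_bottom))),
    )


# With jitter disabled the crop is simply the box enlarged by the margin.
exact = zoom_crop_box((40, 30, 140, 130), (200, 200), margin=0.1, jitter=0.0)

# With jitter enabled every call gives a slightly different window.
noisy = zoom_crop_box((40, 30, 140, 130), (200, 200), rng=random.Random(0))
```

The returned tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so under these assumptions the cropped region could then be resized to the network's input resolution.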
Bridging Between Computer and Robot Vision Through Data Augmentation: A Case Study on Object Recognition / D'Innocente, Antonio; Carlucci, Fabio Maria; Colosi, Mirco; Caputo, Barbara. - Electronic. - 10528:(2017), pp. 384-393. (Paper presented at the 11th International Conference, ICVS 2017, held in Shenzhen, China) [10.1007/978-3-319-68345-4_34].
Bridging Between Computer and Robot Vision Through Data Augmentation: A Case Study on Object Recognition
Antonio D'Innocente; Fabio Maria Carlucci; Mirco Colosi; Barbara Caputo
2017
File | Size | Format | |
---|---|---|---|
DInnocente_Postprint_Bridging-Between-Computer_2017.pdf; open access; Note: https://link.springer.com/chapter/10.1007/978-3-319-68345-4_34; Type: Post-print document (version following peer review and accepted for publication); License: All rights reserved | 829.11 kB | Adobe PDF | |
DInnocente_Bridging-Between-Computer_2017.pdf; archive managers only; Type: Publisher's version (published version with the publisher's layout); License: All rights reserved | 941.68 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.