Indoor scenes are characterized by high intra-class variability, mainly due to the intrinsic variety of the objects in them and to the drastic image variations caused by (even small) viewpoint changes. One of the main trends in the literature has been to employ representations that couple statistical characterizations of the image with a description of their spatial distribution. This is usually done by combining multiple representations of different image regions, most often using a fixed 4×4 or pyramidal image-partitioning scheme. While these encodings are able to capture the spatial regularities of the problem, they are unsuitable for handling its spatial variabilities. In this work we propose to complement a traditional spatial-encoding scheme with a bottom-up approach designed to discover visual structures regardless of their exact position in the scene. To this end we use saliency maps to segment each image into two regions: the most and least salient 50%. This segmentation provides a description of images that is related to the relative semantics of the discovered regions, complementing the canonical spatial description. We evaluated the proposed technique on three public scene recognition datasets. Our results show this approach to be effective in the indoor scenario, while also being meaningful for other scene categorization tasks.
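
The saliency-driven split described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the saliency detector, the local features, or the pooling operator, so the sketch assumes a precomputed saliency map, a hypothetical per-pixel map of visual-word indices, and simple histogram (sum) pooling per region.

```python
import numpy as np

def saliency_split_pooling(codes, saliency, n_words):
    """Pool visual-word counts separately over the most and least salient halves.

    codes    : (H, W) integer array of visual-word indices (assumed input).
    saliency : (H, W) float array, higher means more salient (assumed input).
    n_words  : size of the visual vocabulary.
    Returns a vector of length 2 * n_words:
    [histogram of most salient half, histogram of least salient half].
    """
    codes = np.asarray(codes)
    saliency = np.asarray(saliency, dtype=float)
    if codes.shape != saliency.shape:
        raise ValueError("codes and saliency must have the same shape")

    flat_codes = codes.ravel()
    # Rank pixels from most to least salient; a stable sort keeps ties deterministic.
    order = np.argsort(-saliency.ravel(), kind="stable")
    half = flat_codes.size // 2
    regions = (order[:half], order[half:])  # most salient 50%, least salient 50%

    pooled = []
    for idx in regions:
        hist = np.bincount(flat_codes[idx], minlength=n_words).astype(float)
        total = hist.sum()
        if total > 0:
            hist /= total  # L1-normalise each region's histogram
        pooled.append(hist)
    return np.concatenate(pooled)

# Toy 2x2 example with a 3-word vocabulary.
toy_codes = np.array([[0, 1], [2, 2]])
toy_saliency = np.array([[0.9, 0.8], [0.1, 0.2]])
descriptor = saliency_split_pooling(toy_codes, toy_saliency, n_words=3)
```

Splitting by rank rather than by a fixed threshold guarantees that each region covers half of the image whatever the saliency scale. In the scheme the abstract outlines, such a descriptor would complement a conventional spatial encoding rather than replace it.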

Indoor scene recognition using task and saliency-driven feature pooling / Fornoni, Marco; Caputo, Barbara. - Electronic. - (2012), pp. 98.1-98.12. (Paper presented at the 2012 23rd British Machine Vision Conference, BMVC 2012, held in Guildford, Surrey, UK, 3-7 September 2012) [10.5244/C.26.98].

Indoor scene recognition using task and saliency-driven feature pooling

CAPUTO, BARBARA
2012

2012 23rd British Machine Vision Conference, BMVC 2012
04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/951711