MARE: Self-supervised multi-attention REsU-Net for semantic segmentation in remote sensing

Marsocci V.; Scardapane S.; Komodakis N.
2021

Abstract

Scene understanding of satellite and aerial images is a pivotal task in many remote sensing (RS) applications, such as land cover and urban development monitoring. In recent years, neural networks have become the de facto standard in many of these applications. However, semantic segmentation remains a challenging task. Compared with other computer vision (CV) areas, large labeled datasets are seldom available in RS because of their high cost and the manpower required. Meanwhile, self-supervised learning (SSL) is attracting growing interest in CV, reaching state-of-the-art results in several tasks. Even so, most SSL models, pretrained on huge datasets such as ImageNet, do not perform particularly well on RS data. For this reason, we propose combining an SSL algorithm (specifically, Online Bag of Words) with a semantic segmentation algorithm designed for aerial images (namely, Multistage Attention ResU-Net), and report encouraging new results (i.e., 81.76% mIoU with a ResNet-18 backbone) on the ISPRS Vaihingen dataset.
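The headline figure above is a mean intersection-over-union (mIoU). As a rough illustration of how such a score is commonly computed, here is a minimal sketch using the standard per-class definition (intersection divided by union, averaged over classes). The function name `mean_iou` and the toy labels are hypothetical, and the exact evaluation protocol used for the reported 81.76% is not stated in this record and may differ, for example in which classes are included.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average per-class IoU over classes that occur in either labeling."""
    ious = []
    for c in range(num_classes):
        # Pixels where both the reference and the prediction say class c.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels where at least one of them says class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:  # skip classes absent from both label maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy data: six pixels, three classes (not from the paper).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(truth, pred, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In this toy case the per-class IoUs are 1/3, 2/3 and 1/2. Real segmentation maps would be flattened to one label per pixel before being passed in.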
linear attention; self-supervised learning; semantic segmentation; Vaihingen dataset; remote sensing; deep learning
01 Journal publication::01a Journal article
MARE: Self-supervised multi-attention REsU-Net for semantic segmentation in remote sensing / Marsocci, V.; Scardapane, S.; Komodakis, N. - In: REMOTE SENSING. - ISSN 2072-4292. - 13:16(2021). [10.3390/rs13163275]
Files attached to this record
File: Marsocci_MARE_2021.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 2.75 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1583234