Nowadays the accurate geo-localization of ground-view images has an important role across domains as diverse as journalism, forensics analysis, transports, and Earth Observation. This work addresses the problem of matching a query ground-view image with the corresponding satellite image without GPS data. This is done by comparing the features from a ground-view image and a satellite one, innovatively leveraging the corresponding latter's segmentation mask through a three-stream Siamese-like network. The proposed method, Semantic Align Net (SAN), focuses on limited Field-of-View (FoV) and ground panorama images (images with a FoV of 360°). The novelty lies in the fusion of satellite images in combination with their segmentation masks, aimed at ensuring that the model can extract useful features and focus on the significant parts of the images. This work shows how SAN through semantic analysis of images improves the performance on the unlabelled CVUSA dataset for all the tested FoVs.
A Semantic Segmentation-Guided Approach for Ground-to-Aerial Image Matching / Pro, F.; Dionelis, N.; Maiano, L.; Le Saux, B.; Amerini, I.. - (2024), pp. 2630-2635. (Intervento presentato al convegno 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024 tenutosi a Grecia) [10.1109/IGARSS53475.2024.10642526].
A Semantic Segmentation-Guided Approach for Ground-to-Aerial Image Matching
Pro F.;Maiano L.;Amerini I.
2024
Abstract
Nowadays the accurate geo-localization of ground-view images has an important role across domains as diverse as journalism, forensics analysis, transports, and Earth Observation. This work addresses the problem of matching a query ground-view image with the corresponding satellite image without GPS data. This is done by comparing the features from a ground-view image and a satellite one, innovatively leveraging the corresponding latter's segmentation mask through a three-stream Siamese-like network. The proposed method, Semantic Align Net (SAN), focuses on limited Field-of-View (FoV) and ground panorama images (images with a FoV of 360°). The novelty lies in the fusion of satellite images in combination with their segmentation masks, aimed at ensuring that the model can extract useful features and focus on the significant parts of the images. This work shows how SAN through semantic analysis of images improves the performance on the unlabelled CVUSA dataset for all the tested FoVs.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.