Research products catalogue

A distributed approach for persistent homology computation on a large scale / Ceccaroni, Riccardo; Di Rocco, Lorenzo; Ferraro Petrillo, Umberto; Brutti, Pierpaolo. - In: THE JOURNAL OF SUPERCOMPUTING. - ISSN 0920-8542. - (2024). [10.1007/s11227-024-06374-5]

A distributed approach for persistent homology computation on a large scale

Ceccaroni, Riccardo;Di Rocco, Lorenzo;Ferraro Petrillo, Umberto;Brutti, Pierpaolo

2024

Abstract

Persistent homology (PH) is a powerful mathematical method for automatically extracting relevant insights from images, such as those produced by high-resolution imaging devices like electron microscopes or new-generation telescopes. However, applying this method comes at a very high computational cost, which is bound to grow further as new imaging devices generate an ever-growing amount of data. In this paper, we present PixHomology, a novel algorithm for efficiently computing zero-dimensional PH on 2D images that optimizes memory usage and processing time. By leveraging the Apache Spark framework, we also present a distributed version of our algorithm, with several optimized variants, that can concurrently process large batches of astronomical images. Finally, we present the results of an experimental analysis showing that our algorithm and its distributed version are efficient in terms of required memory, execution time, and scalability, consistently outperforming existing state-of-the-art PH computation tools when processing large datasets.

Publication year: 2024

Keywords: Persistent homology; Distributed computing; Apache Spark; Large-scale image analysis

Type: 01 Journal publication::01a Journal article

Citation: A distributed approach for persistent homology computation on a large scale / Ceccaroni, Riccardo; Di Rocco, Lorenzo; Ferraro Petrillo, Umberto; Brutti, Pierpaolo. - In: THE JOURNAL OF SUPERCOMPUTING. - ISSN 0920-8542. - (2024). [10.1007/s11227-024-06374-5]

Belongs to type: 01a Journal article

Files attached to this product

File	Size	Format
Ceccaroni_Distributed_2024.pdf (open access; note: https://doi.org/10.1007/s11227-024-06374-5; type: publisher's version, published with the publisher's layout; license: Creative Commons)	2.03 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1717606
