PICO: Performance Insights for Collective Operations / Pasqualoni, Saverio; Bonato, Tommaso; Piarulli, Lorenzo; Hoefler, Torsten; Canini, Marco; De Sensi, Daniele. - (2026). (ISC High Performance (formerly International Supercomputing Conference), Hamburg).
PICO: Performance Insights for Collective Operations
Saverio Pasqualoni; Lorenzo Piarulli; Daniele De Sensi
2026
Abstract
Collective operations are cornerstones of both HPC applications and large-scale AI training and inference, yet benchmarking them systematically and reproducibly remains difficult on modern systems due to the complexity of their hardware and software stacks. Existing suites primarily report end-to-end timings and offer limited support for controlled algorithm and configuration selection, fine-grained profiling, and capturing the runtime environment. We present PICO (Performance Insights for Collective Operations), an open-source framework that decouples portable experiment setup from platform execution, provides a backend-adaptive parameter selection interface across MPI and NCCL, supplies optionally instrumentable plain-MPI reference implementations of the collectives, and records the system configuration for reproducible comparisons. Evaluated on three major supercomputers, PICO shows that default collective algorithms and transport settings can be up to 5x slower than the best available choice. It provides diagnostic evidence by isolating topology-sensitive algorithmic choices and, through instrumentation, reveals detailed algorithmic breakdowns. To assess the end-to-end effects of benchmark-informed tuning and evaluate application-level impacts, we replay open-source LLM training traces in the ATLAHS simulator with optimized collective profiles identified by PICO, achieving reductions in training time of up to 44%.
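To make the abstract's notion of "reference collective implementations" and "algorithmic breakdowns" concrete, the following is a minimal, illustrative sketch (not taken from the paper or the PICO code base) of the classic ring allreduce algorithm, one of the algorithm choices that collective benchmarks typically compare against vendor defaults. It simulates all ranks in a single process with plain Python lists; the function name and buffer layout are our own assumptions. The structure mirrors the real algorithm: a reduce-scatter phase followed by an allgather phase, each taking p-1 communication steps for p ranks.

```python
def ring_allreduce(buffers):
    """Sum-allreduce across simulated ranks using the ring algorithm.

    `buffers` holds one list of numbers per rank; all lists have equal
    length n, which for simplicity must be divisible by the rank count p.
    Phase 1 (reduce-scatter): after p-1 steps, rank r holds the fully
    reduced chunk (r + 1) mod p. Phase 2 (allgather): the reduced chunks
    circulate for p-1 more steps until every rank holds the full result.
    """
    p = len(buffers)
    n = len(buffers[0])
    assert n % p == 0, "buffer length must divide evenly among ranks"
    chunk = n // p

    # Phase 1: reduce-scatter. In step s, rank r sends chunk (r - s) mod p
    # to rank (r + 1) mod p, which accumulates it. Sends are snapshotted
    # first so every rank works on pre-step data, as in a real exchange.
    for s in range(p - 1):
        sends = []
        for r in range(p):
            c = (r - s) % p
            sends.append((r, c, buffers[r][c * chunk:(c + 1) * chunk]))
        for r, c, data in sends:
            dst = (r + 1) % p
            for i in range(chunk):
                buffers[dst][c * chunk + i] += data[i]

    # Phase 2: allgather. In step s, rank r forwards chunk (r + 1 - s)
    # mod p (already fully reduced) to its neighbor, which overwrites.
    for s in range(p - 1):
        sends = []
        for r in range(p):
            c = (r + 1 - s) % p
            sends.append((r, c, buffers[r][c * chunk:(c + 1) * chunk]))
        for r, c, data in sends:
            dst = (r + 1) % p
            buffers[dst][c * chunk:(c + 1) * chunk] = data
    return buffers
```

Timing the two phases separately is exactly the kind of per-phase breakdown an instrumented reference implementation can expose, in contrast to the single end-to-end number reported by most existing suites.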


