Multi-GPU nodes are increasingly common in the rapidly evolving landscape of exascale supercomputers. On these systems, GPUs on the same node are connected through dedicated networks, with bandwidths up to a few terabits per second. However, gauging performance expectations and maximizing system efficiency are challenging due to the different technologies, design options, and software layers involved. This paper comprehensively characterizes three supercomputers (Alps, Leonardo, and LUMI), each with a unique architecture and design. We focus on performance evaluation of intra-node and inter-node interconnects on up to 4,096 GPUs, using a mix of intra-node and inter-node benchmarks. By analyzing their limitations and opportunities, we aim to offer practical guidance to researchers, system architects, and software developers dealing with multi-GPU supercomputing. Our results show that there is untapped bandwidth and that many opportunities for optimization remain, ranging from the network to the software.

Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects / De Sensi, D.; Pichetti, L.; Vella, F.; De Matteis, T.; Ren, Z.; Fusco, L.; Turisini, M.; Cesarini, D.; Lust, K.; Trivedi, A.; Roweth, D.; Spiga, F.; Di Girolamo, S.; Hoefler, T. - (2024), pp. 1-15. (International Conference for High Performance Computing, Networking, Storage and Analysis (was Supercomputing Conference), Georgia World Congress Center, USA) [10.1109/SC41406.2024.00039].

Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects

De Sensi D. (first author); 2024

2024
International Conference for High Performance Computing, Networking, Storage and Analysis (was Supercomputing Conference)
supercomputer; gpu; interconnect
04 Publication in conference proceedings::04b Conference paper in volume


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1737589