Cloud computing represents an appealing opportunity for cost-effective deployment of HPC workloads on the best-fitting hardware. However, although cloud and on-premise HPC systems offer similar computational resources, their network architecture and performance may differ significantly. For example, these systems use fundamentally different network transport and routing protocols, which may introduce network noise that can eventually limit application scaling. This work analyzes the network performance, scalability, and cost of running HPC workloads on cloud systems. First, we consider latency, bandwidth, and collective communication patterns in detailed small-scale measurements, and then we simulate network performance at a larger scale. We validate our approach on four popular cloud providers and three on-premise HPC systems, showing that network (and also OS) noise can significantly impact performance and cost at both small and large scale.

Noise in the Clouds: Influence of Network Performance Variability on Application Scalability / De Sensi, Daniele; De Matteis, Tiziano; Taranov, Konstantin; Di Girolamo, Salvatore; Rahn, Tobias; Hoefler, Torsten. - (2022).

Noise in the Clouds: Influence of Network Performance Variability on Application Scalability

Daniele De Sensi
2022

Proc. ACM Meas. Anal. Comput. Syst.
cloud; HPC; network noise; scalability
02 Publication in a volume::02a Chapter or Article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1661239