Abstract: Recent years have seen a progressive shift toward the cloud-computing model across a wide variety of application domains. High-performance computing (HPC) is no exception to this trend, with a steadily increasing number of performance-heavy scientific experiments being conducted on the cloud. However, this approach may pose serious security, operational, and economic issues. In this paper we report on our experience with this scenario, which culminated in the development of a small-scale on-premises HPC facility called TeraStat, available through our Department. This facility has been purposely deployed and customized to simplify and accelerate the execution of performance-heavy computational experiments. The success of TeraStat is confirmed by the large number of experiments conducted in recent years. Here, we focus on two of the most significant case studies, outlining the positive role played by TeraStat in their solution. The first is a large-scale analysis of climate data with annual cycles. The second is a study of the advantages of using compression techniques when processing large genomic data with a distributed approach.

High-Performance Computing with TeraStat / Bompiani, Edoardo; Jona-Lasinio, Giovanna; Ferraro-Petrillo, Umberto; Palini, Francesco. - (2020), pp. 487-494. (Paper presented at the 18th IEEE Conference on Dependable, Autonomic and Secure Computing, held in Calgary, Canada).

High-Performance Computing with TeraStat

Bompiani, Edoardo; Jona-Lasinio, Giovanna; Ferraro-Petrillo, Umberto; Palini, Francesco
2020

18th IEEE Conference on Dependable, Autonomic and Secure Computing
HPC; statistical models; Apache Spark
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: Bompiani_High-Performance-computing_2020.pdf
Access: archive managers only
Type: Post-print (version following peer review and accepted for publication)
License: All rights reserved
Size: 164.76 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1451125
Citations
  • PubMed Central: not available
  • Scopus: 4
  • Web of Science (ISI): 2