CCL v3.0: Multiprogrammed semi-asynchronous checkpoints / Quaglia, Francesco; Santoro, A. - (2003), pp. 21-28. (Paper presented at the 17th Workshop on Parallel and Distributed Simulation, held in San Diego, CA, June 10-13, 2003) [10.1109/pads.2003.1207417].

CCL v3.0: Multiprogrammed semi-asynchronous checkpoints

Quaglia, Francesco; Santoro, A.
2003

Abstract

CCL (Checkpointing and Communication Library) is a recently developed software library that supports optimistic parallel simulation on Myrinet-based clusters. Beyond classical low-latency message delivery, the library implements CPU-offloaded, semi-asynchronous checkpointing based on the data transfer capabilities of a programmable DMA engine on board Myrinet network cards. The latest version of CCL (v2.4), designed for M2M-PCI32C Myrinet cards, supports only monoprogrammed semi-asynchronous checkpoints. This forces the CPU and the DMA engine to re-synchronize whenever the simulation application must issue a new checkpoint request while the DMA engine is still carrying out the previously issued one. In this paper we present CCL v3.0, which exploits hardware features of the more advanced M3M-PCI64C Myrinet cards to support multiprogrammed semi-asynchronous checkpoints. The multiprogrammed approach allows a higher degree of concurrency between checkpointing and the other simulation-specific operations carried out by the CPU, which improves performance. We also report an evaluation of those benefits for a personal communication system simulation application.
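
The difference between the two checkpointing modes can be illustrated with a toy timing model. The sketch below is not CCL code and does not use the CCL API: the function `cpu_stall_time`, its parameters, and the assumption that one checkpoint request is issued after every simulation event are all hypothetical, chosen only to show how allowing several outstanding checkpoint requests removes CPU waiting time. A limit of one outstanding request stands in for the monoprogrammed case; a larger limit stands in for the multiprogrammed case.

```python
from collections import deque


def cpu_stall_time(event_costs, dma_cost, max_pending):
    """Return the total time the CPU waits on the DMA engine (toy model).

    event_costs -- CPU time of each simulation event; one checkpoint
                   request is issued after every event (assumption)
    dma_cost    -- time the DMA engine needs per checkpoint (assumption)
    max_pending -- checkpoint requests that may be outstanding at once
                   (1 models monoprogrammed, >1 models multiprogrammed)
    """
    now = 0.0            # current CPU time
    stall = 0.0          # accumulated CPU waiting time
    dma_free_at = 0.0    # time at which the DMA engine becomes idle
    outstanding = deque()  # completion times of unfinished checkpoints

    for cost in event_costs:
        now += cost  # the CPU executes one simulation event

        # Retire checkpoints the DMA engine has already completed.
        while outstanding and outstanding[0] <= now:
            outstanding.popleft()

        # No free slot: the CPU must re-synchronize with the DMA engine
        # by waiting for the oldest outstanding checkpoint to finish.
        if len(outstanding) >= max_pending:
            done = outstanding.popleft()
            stall += done - now
            now = done

        # Issue the new request; the DMA engine serves requests in order.
        start = max(now, dma_free_at)
        dma_free_at = start + dma_cost
        outstanding.append(dma_free_at)

    return stall


if __name__ == "__main__":
    events = [1.0, 1.0, 1.0, 1.0]
    print("monoprogrammed stall: ", cpu_stall_time(events, 1.5, 1))
    print("multiprogrammed stall:", cpu_stall_time(events, 1.5, 4))
```

In this model the CPU stalls only when every slot is taken, so raising `max_pending` trades buffer space for fewer re-synchronizations. The numbers are illustrative and say nothing about the performance figures reported in the paper.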
ISBN: 9780769519708

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/61394