CCL (Checkpointing and Communication Library) is a recently developed software library that supports optimistic parallel simulation on Myrinet-based clusters. Beyond classical low-latency message delivery, the library implements CPU-offloaded, semi-asynchronous checkpointing that relies on the data transfer capabilities of a programmable DMA engine on board Myrinet network cards. The most recent release, CCL v2.4, designed for M2M-PCI32C Myrinet cards, supports only monoprogrammed semi-asynchronous checkpoints. This forces the CPU and the DMA engine to re-synchronize whenever the simulation application issues a new checkpoint request while the previously issued one is still being carried out by the DMA engine. In this paper we present CCL v3.0, which exploits hardware features of the more advanced M3M-PCI64C Myrinet cards to support multiprogrammed semi-asynchronous checkpoints. The multiprogrammed approach allows a higher degree of concurrency between checkpointing and the other simulation-specific operations carried out by the CPU, which improves performance. We also report an evaluation of those benefits for a personal communication system simulation application.
|Title:||CCL v3.0: Multiprogrammed semi-asynchronous checkpoints|
|Publication date:||2003|
|Type:||04a Conference paper|