Recent papers have shown that the performance of Time Warp simulators can be improved by appropriately selecting the positions of checkpoints instead of taking them periodically. In this paper, we present a checkpointing technique in which checkpoint positions are selected on the basis of a checkpointing-recovery cost model. Given the current state S, the model determines whether it is convenient to record S as a checkpoint before the next event is executed. The decision takes into account the position of the last checkpoint taken, the granularity (i.e., the execution time) of intermediate events, and an estimate of the probability that S will have to be restored because of a rollback later in the execution. A synthetic benchmark in different configurations is used to evaluate this approach and compare it with classical periodic techniques. As a testing environment we used a cluster of PCs connected through a Myrinet switch, coupled with a fast communication layer specifically designed to exploit the potential of this type of switch. The results show that our solution allows faster execution and, in some cases, has the additional advantage of requiring less memory for recording state vectors. This may contribute to further performance improvements when memory is a critical resource for the specific application. Finally, a performance study of a cellular phone system simulation is reported to demonstrate the effectiveness of this solution for a real-world application.
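
The kind of decision described in the abstract can be illustrated with a minimal sketch. It is not the cost model from the paper, whose formulas are not given in this record. It assumes a simple rule: record S when the cost of saving it is lower than the expected cost of re-executing the events since the last checkpoint, weighted by the estimated probability that S must be restored. The class name `CheckpointAdvisor`, the parameter `save_cost`, and the way the probability is supplied are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CheckpointAdvisor:
    """Illustrative decision rule only; NOT the model defined in the paper."""

    # Assumed (hypothetical) time needed to record one state vector.
    save_cost: float
    # Granularities (execution times) of events run since the last checkpoint.
    events_since_checkpoint: List[float] = field(default_factory=list)

    def record_event(self, granularity: float) -> None:
        """Note that an event with the given execution time was processed."""
        self.events_since_checkpoint.append(granularity)

    def should_checkpoint(self, restore_probability: float) -> bool:
        """Return True if saving the current state S looks convenient.

        Assumption: if S is not saved and later has to be restored, the
        events executed since the last checkpoint must be replayed.
        """
        replay_cost = sum(self.events_since_checkpoint)
        expected_penalty = restore_probability * replay_cost
        return self.save_cost < expected_penalty

    def mark_checkpoint(self) -> None:
        """Reset the bookkeeping after S has been recorded."""
        self.events_since_checkpoint.clear()


advisor = CheckpointAdvisor(save_cost=50.0)
for granularity in (30.0, 30.0, 30.0):
    advisor.record_event(granularity)

low_risk = advisor.should_checkpoint(restore_probability=0.1)
high_risk = advisor.should_checkpoint(restore_probability=0.8)
```

In this sketch, a low restore probability makes the expected replay penalty smaller than the saving cost, so the checkpoint is skipped. A high probability tips the decision the other way. How the probability is estimated at run time is left open here.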

A cost model for selecting checkpoint positions in time warp parallel simulation / Quaglia, Francesco. - In: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS. - ISSN 1045-9219. - 12:4(2001), pp. 346-362. [10.1109/71.920586]

A cost model for selecting checkpoint positions in time warp parallel simulation

QUAGLIA, Francesco
2001

checkpointing; cost models; optimistic synchronization; parallel discrete-event simulation; performance optimization; rollback-recovery; time warp
01 Journal publication::01a Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/102965