Given a checkpoint and communication pattern of a distributed execution, the No-Z-Cycle (NZC) property states that no checkpoint has a dependency on itself. In other words, there is no noncausal sequence of messages that starts after a checkpoint and terminates before that same checkpoint. Operationally, this property means that each checkpoint belongs to at least one consistent global checkpoint, so any checkpoint can be used, for example, to restart a distributed application after a failure. In this paper we derive a characterization of the NZC property (previously an open problem). The characterization identifies a subset of Z-cycles, namely core Z-cycles (CZCs), that must be empty for the checkpoint and communication pattern of the execution to satisfy the NZC property. We then present a communication-induced checkpointing protocol that prevents CZCs on the fly; the protocol removes the causal part common to every CZC. Finally, we propose a taxonomy of communication-induced checkpointing protocols that ensure the NZC property.
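The definition of a Z-cycle given in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes the usual zigzag-path formulation, in which each message in the sequence is sent by the receiver of the previous one in the same or a later checkpoint interval. The names `Message` and `has_z_cycle` are hypothetical, and the indexing convention (interval `k` of a process holds the events after its checkpoint `k` and before its checkpoint `k + 1`) is an assumption made for the example.

```python
from collections import deque
from typing import NamedTuple


class Message(NamedTuple):
    """Hypothetical record of one message in the pattern (assumed layout)."""
    sender: int         # sending process
    send_interval: int  # checkpoint interval in which the message is sent
    receiver: int       # receiving process
    recv_interval: int  # checkpoint interval in which the message is received


def has_z_cycle(messages, process, index):
    """Return True if checkpoint `index` of `process` lies on a Z-cycle.

    Assumed convention: interval k of a process contains the events after
    its checkpoint k and before its checkpoint k + 1.
    """
    # The sequence must start with a message sent after the checkpoint.
    queue = deque(
        m for m in messages
        if m.sender == process and m.send_interval >= index
    )
    seen = set(queue)
    while queue:
        m = queue.popleft()
        # The sequence terminates before the same checkpoint: cycle found.
        if m.receiver == process and m.recv_interval < index:
            return True
        # The next message is sent by the receiver of m in the same or a
        # later interval (possibly before m is received: the noncausal step).
        for nxt in messages:
            if (nxt not in seen
                    and nxt.sender == m.receiver
                    and nxt.send_interval >= m.recv_interval):
                seen.add(nxt)
                queue.append(nxt)
    return False


# Process 0 sends m1 after its checkpoint 1; process 1 receives m1 in
# interval 0, the same interval in which it had already sent m2, and m2
# reaches process 0 before checkpoint 1.
m1 = Message(sender=0, send_interval=1, receiver=1, recv_interval=0)
m2 = Message(sender=1, send_interval=0, receiver=0, recv_interval=0)
cycle_found = has_z_cycle([m1, m2], process=0, index=1)

# If process 1 takes a checkpoint before receiving m1, the reception moves
# to interval 1 and m2 can no longer follow m1 in the sequence.
m1_late = Message(sender=0, send_interval=1, receiver=1, recv_interval=1)
cycle_broken = has_z_cycle([m1_late, m2], process=0, index=1)
```

In this toy pattern `cycle_found` is true and `cycle_broken` is false, which shows how an additional checkpoint on the receiving side can break a cycle. The sketch only detects a Z-cycle after the fact; it does not model the core Z-cycle characterization or the on-the-fly protocol described in the abstract.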
On the No-Z-Cycle property in distributed executions / Baldoni, Roberto; Quaglia, Francesco; Ciciani, Bruno. - In: JOURNAL OF COMPUTER AND SYSTEM SCIENCES. - ISSN 0022-0000. - 61:3(2000), pp. 400-427. (Paper presented at the 17th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, held in Seattle, WA, USA, from 1 to 4 June 1998) [10.1006/jcss.2000.1720].
On the No-Z-Cycle property in distributed executions
BALDONI, Roberto; QUAGLIA, Francesco; CICIANI, Bruno
2000