In this paper we present the first algorithm to compute the Strongly Connected Components of a graph in the datastream model (W-Stream), where the graph is represented by a stream of edges and we are allowed to produce intermediate output streams. The algorithm is simple, effective, and can be implemented with few lines of code: it looks at each edge in the stream, and selects the appropriate action with respect to a tree T, representing the graph connectivity seen so far. We analyze the theoretical properties of the algorithm: correctness, memory occupation (O(n logn)), per item processing time (bounded by the current height of T), and number of passes (bounded by the maximal height of T). We conclude by presenting a brief experimental evaluation of the algorithm against massive synthetic and real graphs that confirms its effectiveness: with graphs with up to 100M nodes and 4G edges, only few passes are needed, and millions of edges per second are processed.

Computing Strongly Connected Components in the Streaming Model / Laura, Luigi; F., Santaroni. - 6595:(2011), pp. 193-205. (Intervento presentato al convegno First International ICST Conference, TAPAS 2011, Rome, Italy, tenutosi a Rome, Italy nel April 18-20, 2011.) [10.1007/978-3-642-19754-3_20].

Computing Strongly Connected Components in the Streaming Model.

LAURA, Luigi;
2011

Abstract

In this paper we present the first algorithm to compute the Strongly Connected Components of a graph in the datastream model (W-Stream), where the graph is represented by a stream of edges and we are allowed to produce intermediate output streams. The algorithm is simple, effective, and can be implemented with few lines of code: it looks at each edge in the stream, and selects the appropriate action with respect to a tree T, representing the graph connectivity seen so far. We analyze the theoretical properties of the algorithm: correctness, memory occupation (O(n logn)), per item processing time (bounded by the current height of T), and number of passes (bounded by the maximal height of T). We conclude by presenting a brief experimental evaluation of the algorithm against massive synthetic and real graphs that confirms its effectiveness: with graphs with up to 100M nodes and 4G edges, only few passes are needed, and millions of edges per second are processed.
2011
First International ICST Conference, TAPAS 2011, Rome, Italy,
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
Computing Strongly Connected Components in the Streaming Model / Laura, Luigi; F., Santaroni. - 6595:(2011), pp. 193-205. (Intervento presentato al convegno First International ICST Conference, TAPAS 2011, Rome, Italy, tenutosi a Rome, Italy nel April 18-20, 2011.) [10.1007/978-3-642-19754-3_20].
File allegati a questo prodotto
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/759814
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 9
  • ???jsp.display-item.citation.isi??? 5
social impact