In this paper we study the problem of local triangle counting in large graphs. Namely, given a large graph G = (V;E) we want to estimate as accurately as possible the number of triangles incident to every node v ε V in the graph. The problem of computing the global number of triangles in a graph has been considered before, but to our knowledge this is the first paper that addresses the problem of local triangle counting with a focus on the efficiency issues arising in massive graphs. The distribution of the local number of triangles and the related local clustering coefficient can be used in many interesting applications. For example, we show that the measures we compute can help to detect the presence of spamming activity in large-scale Web graphs, as well as to provide useful features to assess content quality in social networks. For computing the local number of triangles we propose two approximation algorithms, which are based on the idea of min-wise independent permutations (Broder et al. 1998). Our algorithms operate in a semi-streaming fashion, using O(|V|) space in main memory and performing O(log |V|) sequential scans over the edges of the graph. The first algorithm we describe in this paper also uses O(jEj) space in external memory during computation, while the second algorithm uses only main memory. We present the theoretical analysis as well as experimental results in massive graphs demonstrating the practical efficiency of our approach. Copyright 2008 ACM.

Efficient semi-streaming algorithms for local triangle counting in massive graphs / Becchetti, Luca; Paolo, Boldi; Carlos, Castillo; Aristides, Gionis. - (2008), pp. 16-24. (Intervento presentato al convegno 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008 tenutosi a Las Vegas; United States nel 24 August 2008 through 27 August 2008) [10.1145/1401890.1401898].

Efficient semi-streaming algorithms for local triangle counting in massive graphs

BECCHETTI, Luca;
2008

Abstract

In this paper we study the problem of local triangle counting in large graphs. Namely, given a large graph G = (V;E) we want to estimate as accurately as possible the number of triangles incident to every node v ε V in the graph. The problem of computing the global number of triangles in a graph has been considered before, but to our knowledge this is the first paper that addresses the problem of local triangle counting with a focus on the efficiency issues arising in massive graphs. The distribution of the local number of triangles and the related local clustering coefficient can be used in many interesting applications. For example, we show that the measures we compute can help to detect the presence of spamming activity in large-scale Web graphs, as well as to provide useful features to assess content quality in social networks. For computing the local number of triangles we propose two approximation algorithms, which are based on the idea of min-wise independent permutations (Broder et al. 1998). Our algorithms operate in a semi-streaming fashion, using O(|V|) space in main memory and performing O(log |V|) sequential scans over the edges of the graph. The first algorithm we describe in this paper also uses O(jEj) space in external memory during computation, while the second algorithm uses only main memory. We present the theoretical analysis as well as experimental results in massive graphs demonstrating the practical efficiency of our approach. Copyright 2008 ACM.
2008
14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008
graph mining; probabilistic algorithms; semi-streaming
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
Efficient semi-streaming algorithms for local triangle counting in massive graphs / Becchetti, Luca; Paolo, Boldi; Carlos, Castillo; Aristides, Gionis. - (2008), pp. 16-24. (Intervento presentato al convegno 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2008 tenutosi a Las Vegas; United States nel 24 August 2008 through 27 August 2008) [10.1145/1401890.1401898].
File allegati a questo prodotto
File Dimensione Formato  
VE_2008_11573-55120.pdf

solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 291.45 kB
Formato Adobe PDF
291.45 kB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/55120
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 232
  • ???jsp.display-item.citation.isi??? ND
social impact