In this article, we study the problem of approximate local triangle counting in large graphs. Namely, given a large graph G = (V, E) we want to estimate as accurately as possible the number of triangles incident to every node v. V in the graph. We consider the question both for undirected and directed graphs. The problem of computing the global number of triangles in a graph has been considered before, but to our knowledge this is the first contribution that addresses the problem of approximate local triangle counting with a focus on the efficiency issues arising in massive graphs and that also considers the directed case. The distribution of the local number of triangles and the related local clustering coefficient can be used in many interesting applications. For example, we show that the measures we compute can help detect the presence of spamming activity in large-scale Web graphs, as well as to provide useful features for content quality assessment in social networks. For computing the local number of triangles (undirected and directed), we propose two approximation algorithms, which are based on the idea of min-wise independent permutations [Broder et al. 1998]. Our algorithms operate in a semi-streaming fashion, using O(|V|) space in main memory and performing O(log |V|) sequential scans over the edges of the graph. The first algorithm we describe in this article also uses O(|E|) space of external memory during computation, while the second algorithm uses only main memory. We present the theoretical analysis as well as experimental results on large graphs, demonstrating the practical efficiency of our approach.

Efficient Algorithms for Large-Scale Local Triangle Counting / Becchetti, Luca; Paolo, Boldi; Carlos, Castillo; Aristides, Gionis. - In: ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA. - ISSN 1556-4681. - STAMPA. - 4:3(2010), pp. 1-28. [10.1145/1839490.1839494]

Efficient Algorithms for Large-Scale Local Triangle Counting

BECCHETTI, Luca;
2010

Abstract

In this article, we study the problem of approximate local triangle counting in large graphs. Namely, given a large graph G = (V, E) we want to estimate as accurately as possible the number of triangles incident to every node v. V in the graph. We consider the question both for undirected and directed graphs. The problem of computing the global number of triangles in a graph has been considered before, but to our knowledge this is the first contribution that addresses the problem of approximate local triangle counting with a focus on the efficiency issues arising in massive graphs and that also considers the directed case. The distribution of the local number of triangles and the related local clustering coefficient can be used in many interesting applications. For example, we show that the measures we compute can help detect the presence of spamming activity in large-scale Web graphs, as well as to provide useful features for content quality assessment in social networks. For computing the local number of triangles (undirected and directed), we propose two approximation algorithms, which are based on the idea of min-wise independent permutations [Broder et al. 1998]. Our algorithms operate in a semi-streaming fashion, using O(|V|) space in main memory and performing O(log |V|) sequential scans over the edges of the graph. The first algorithm we describe in this article also uses O(|E|) space of external memory during computation, while the second algorithm uses only main memory. We present the theoretical analysis as well as experimental results on large graphs, demonstrating the practical efficiency of our approach.
2010
clustering coefficient; massive-graph computing; social networks; web com- puting; web computing
01 Pubblicazione su rivista::01a Articolo in rivista
Efficient Algorithms for Large-Scale Local Triangle Counting / Becchetti, Luca; Paolo, Boldi; Carlos, Castillo; Aristides, Gionis. - In: ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA. - ISSN 1556-4681. - STAMPA. - 4:3(2010), pp. 1-28. [10.1145/1839490.1839494]
File allegati a questo prodotto
File Dimensione Formato  
VE_2010_11573-377239.pdf

solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 873.79 kB
Formato Adobe PDF
873.79 kB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/377239
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 69
  • ???jsp.display-item.citation.isi??? 49
social impact