Catalogo dei prodotti della ricerca

Counting graphlets is a well-studied problem in graph mining and social network analysis. Recently, several papers explored very simple and natural approaches based on Monte Carlo sampling of Markov Chains (MC), and reported encouraging results. We show, perhaps surprisingly, that this approach is outperformed by a carefully engineered version of color coding (CC) [1], a sophisticated algorithmic technique that we extend to the case of graphlet sampling and for which we prove strong statistical guarantees. Our computational experiments on graphs with millions of nodes show CC to be more accurate than MC. Furthermore, we formally show that the mixing time of the MC approach is too high in general, even when the input graph has high conductance. All this comes at a price however. While MC is very efficient in terms of space, CC's memory requirements become demanding when the size of the input graph and that of the graphlets grow. And yet, our experiments show that a careful implementation of CC can push the limits of the state of the art, both in terms of the size of the input graph and of that of the graphlets

Counting graphlets: space vs time / Bressan, Marco; Chierichetti, Flavio; Kumar, Ravi; Leucci, Stefano; Panconesi, Alessandro. - ELETTRONICO. - (2017), pp. 557-566. ( 10th ACM International Conference on Web Search and Data Mining, WSDM 2017 cambridge 2017) [10.1145/3018661.3018732].

Counting graphlets: space vs time

BRESSAN, MARCO;CHIERICHETTI, FLAVIO;Kumar, Ravi;LEUCCI, STEFANO;PANCONESI, Alessandro

2017

Abstract

Counting graphlets is a well-studied problem in graph mining and social network analysis. Recently, several papers explored very simple and natural approaches based on Monte Carlo sampling of Markov Chains (MC), and reported encouraging results. We show, perhaps surprisingly, that this approach is outperformed by a carefully engineered version of color coding (CC) [1], a sophisticated algorithmic technique that we extend to the case of graphlet sampling and for which we prove strong statistical guarantees. Our computational experiments on graphs with millions of nodes show CC to be more accurate than MC. Furthermore, we formally show that the mixing time of the MC approach is too high in general, even when the input graph has high conductance. All this comes at a price however. While MC is very efficient in terms of space, CC's memory requirements become demanding when the size of the input graph and that of the graphlets grow. And yet, our experiments show that a careful implementation of CC can push the limits of the state of the art, both in terms of the size of the input graph and of that of the graphlets

Scheda breve

Scheda completa

	Anno di pubblicazione
	
				2017
			
	Nome convegno
	
				10th ACM International Conference on Web Search and Data Mining, WSDM 2017
			
	Parole chiave
	
				Computer Science Applications; Computer Vision and Pattern Recognition; Information Systems; Computer Networks and Communications; Software
			
	Tipologia
	
				04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
			
	Citazione
	
				Counting graphlets: space vs time / Bressan, Marco; Chierichetti, Flavio; Kumar, Ravi; Leucci, Stefano; Panconesi, Alessandro. - ELETTRONICO. - (2017), pp. 557-566. ( 10th ACM International Conference on Web Search and Data Mining, WSDM 2017 cambridge 2017) [10.1145/3018661.3018732].
			
	Appartiene alla tipologia:
	
				04b Atto di convegno in volume

File allegati a questo prodotto

File	Dimensione	Formato
Bressan_counting_2017.pdf solo gestori archivio Tipologia: Documento in Post-print (versione successiva alla peer review e accettata per la pubblicazione) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 484.09 kB Formato Adobe PDF Contatta l'autore	484.09 kB	Adobe PDF	Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/978478

Citazioni

ND

66

52

social impact