Catalogo dei prodotti della ricerca

In several emerging applications, data is collected in massive streams at several distributed points of observation. A basic and challenging task is to allow every node to monitor a neighbourhood of interest by issuing continuous aggregate queries on the streams observed in its vicinity. This class of algorithms is fully decentralized and diffusive in nature: collecting all data at few central nodes of the network is unfeasible in networks of low capability devices or in the presence of massive data sets. The main difficulty in designing diffusive algorithms is to cope with duplicate detections. These arise both from the observation of the same event at several nodes of the network and/or receipt of the same aggregated information along multiple paths of diffusion. In this paper, we consider fully decentralized algorithms that answer locally continuous aggregate queries on the number of distinct events, total number of events and the second frequency moment in the scenario outlined above. The proposed algorithms use in the worst case or on realistic distributions sublinear space at every node. We also propose strategies that minimize the communication needed to update the aggregates when new events are observed. We experimentally evaluate for the efficiency and accuracy of our algorithms on realistic simulated scenarios.

Fully decentralized computation of aggregates over data streams / Becchetti, Luca; Bordino, Ilaria; Leonardi, Stefano; Adi, Rosen. - In: SIGKDD EXPLORATIONS. - ISSN 1931-0145. - 12:2(2010), pp. 83-91. [10.1145/1964897.1964919]

Fully decentralized computation of aggregates over data streams

BECCHETTI, Luca;BORDINO, ILARIA;LEONARDI, Stefano;Adi Rosen

2010

Abstract

In several emerging applications, data is collected in massive streams at several distributed points of observation. A basic and challenging task is to allow every node to monitor a neighbourhood of interest by issuing continuous aggregate queries on the streams observed in its vicinity. This class of algorithms is fully decentralized and diffusive in nature: collecting all data at few central nodes of the network is unfeasible in networks of low capability devices or in the presence of massive data sets. The main difficulty in designing diffusive algorithms is to cope with duplicate detections. These arise both from the observation of the same event at several nodes of the network and/or receipt of the same aggregated information along multiple paths of diffusion. In this paper, we consider fully decentralized algorithms that answer locally continuous aggregate queries on the number of distinct events, total number of events and the second frequency moment in the scenario outlined above. The proposed algorithms use in the worst case or on realistic distributions sublinear space at every node. We also propose strategies that minimize the communication needed to update the aggregates when new events are observed. We experimentally evaluate for the efficiency and accuracy of our algorithms on realistic simulated scenarios.

Scheda breve

Scheda completa

	Anno di pubblicazione
	
				2010
			
	Parole chiave
	
				Streaming computation; distributed computation; sketching
			
	Tipologia
	
				01 Pubblicazione su rivista::01a Articolo in rivista
			
	Citazione
	
				Fully decentralized computation of aggregates over data streams / Becchetti, Luca; Bordino, Ilaria; Leonardi, Stefano; Adi, Rosen. - In: SIGKDD EXPLORATIONS. - ISSN 1931-0145. - 12:2(2010), pp. 83-91. [10.1145/1964897.1964919]
			
	Appartiene alla tipologia:
	
				01a Articolo in rivista

File allegati a questo prodotto

File	Dimensione	Formato
Becchetti_Preprint-Fully_2010.pdf accesso aperto Tipologia: Documento in Pre-print (manoscritto inviato all'editore, precedente alla peer review) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 478.59 kB Formato Adobe PDF	478.59 kB	Adobe PDF
Becchetti_Fully_2010.pdf solo gestori archivio Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 569.73 kB Formato Adobe PDF Contatta l'autore	569.73 kB	Adobe PDF	Contatta l'autore
VE_2010_11573-378040.pdf solo gestori archivio Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 529.58 kB Formato Adobe PDF Contatta l'autore	529.58 kB	Adobe PDF	Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/378040

Citazioni

ND

ND

ND

social impact