In this paper, we consider the problem of distributed spectral clustering, wherein the data to be clustered is (horizontally) partitioned over a set of interconnected agents with limited connectivity. In order to solve it, we consider the equivalent problem of reconstructing the Euclidean distance matrix of pairwise distances among the joint set of datapoints. This is obtained in a fully decentralized fashion, making use of an innovative distributed gradient-based procedure, where at every agent we interleave gradient steps on a low-rank factorization of the distance matrix, with local averaging steps considering all its neighbors' current estimates. The procedure can be applied to any spectral clustering algorithm, including normalized and unnormalized variations, for multiple choices of the underlying Laplacian matrix. Experimental evaluations demonstrate that the solution is competitive with a fully centralized solver, where data is collected beforehand on a (virtual) coordinating agent.

In this paper, we consider the problem of distributed spectral clustering, wherein the data to be clustered is (horizontally) partitioned over a set of interconnected agents with limited connectivity. In order to solve it, we consider the equivalent problem of reconstructing the Euclidean distance matrix of pairwise distances among the joint set of datapoints. This is obtained in a fully decentralized fashion, making use of an innovative distributed gradient-based procedure, where at every agent we interleave gradient steps on a low-rank factorization of the distance matrix, with local averaging steps considering all its neighbors' current estimates. The procedure can be applied to any spectral clustering algorithm, including normalized and unnormalized variations, for multiple choices of the underlying Laplacian matrix. Experimental evaluations demonstrate that the solution is competitive with a fully centralized solver, where data is collected beforehand on a (virtual) coordinating agent.

Distributed spectral clustering based on Euclidean distance matrix completion / Scardapane, Simone; Altilio, Rosa; Panella, Massimo; Uncini, Aurelio. - STAMPA. - (2016), pp. 3093-3100. (Intervento presentato al convegno International Joint Conference on Neural Networks tenutosi a Vancouver, Canada nel 24-29 luglio 2016) [10.1109/IJCNN.2016.7727593].

Distributed spectral clustering based on Euclidean distance matrix completion

SCARDAPANE, SIMONE;ALTILIO, ROSA;PANELLA, Massimo
;
UNCINI, Aurelio
2016

Abstract

In this paper, we consider the problem of distributed spectral clustering, wherein the data to be clustered is (horizontally) partitioned over a set of interconnected agents with limited connectivity. In order to solve it, we consider the equivalent problem of reconstructing the Euclidean distance matrix of pairwise distances among the joint set of datapoints. This is obtained in a fully decentralized fashion, making use of an innovative distributed gradient-based procedure, where at every agent we interleave gradient steps on a low-rank factorization of the distance matrix, with local averaging steps considering all its neighbors' current estimates. The procedure can be applied to any spectral clustering algorithm, including normalized and unnormalized variations, for multiple choices of the underlying Laplacian matrix. Experimental evaluations demonstrate that the solution is competitive with a fully centralized solver, where data is collected beforehand on a (virtual) coordinating agent.
2016
International Joint Conference on Neural Networks
In this paper, we consider the problem of distributed spectral clustering, wherein the data to be clustered is (horizontally) partitioned over a set of interconnected agents with limited connectivity. In order to solve it, we consider the equivalent problem of reconstructing the Euclidean distance matrix of pairwise distances among the joint set of datapoints. This is obtained in a fully decentralized fashion, making use of an innovative distributed gradient-based procedure, where at every agent we interleave gradient steps on a low-rank factorization of the distance matrix, with local averaging steps considering all its neighbors' current estimates. The procedure can be applied to any spectral clustering algorithm, including normalized and unnormalized variations, for multiple choices of the underlying Laplacian matrix. Experimental evaluations demonstrate that the solution is competitive with a fully centralized solver, where data is collected beforehand on a (virtual) coordinating agent.
Matrix algebra; coordinating agents; distance matrices
04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
Distributed spectral clustering based on Euclidean distance matrix completion / Scardapane, Simone; Altilio, Rosa; Panella, Massimo; Uncini, Aurelio. - STAMPA. - (2016), pp. 3093-3100. (Intervento presentato al convegno International Joint Conference on Neural Networks tenutosi a Vancouver, Canada nel 24-29 luglio 2016) [10.1109/IJCNN.2016.7727593].
File allegati a questo prodotto
File Dimensione Formato  
Scardapane_Distributed-spectral-clustering_2016.pdf

solo utenti autorizzati

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 213.03 kB
Formato Adobe PDF
213.03 kB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/913989
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 5
  • ???jsp.display-item.citation.isi??? 3
social impact