In this paper, we consider the problem of distributed unsupervised learning, where the data to be clustered are partitioned over a set of agents with limited connectivity. To solve this problem, we extend an ensemble clustering procedure in a novel way to make it suitable for a fully distributed scenario. The proposed algorithm can handle the case where each agent holds a different local dataset. Additionally, to reduce the total amount of exchanged information, only the local cluster prototypes are forwarded among neighbors. Cluster similarity indexes are adopted to resolve conflicts among agents and to reach a common structure at the end of the communication process. The experimental results demonstrate the feasibility of this approach, which reaches optimal performance when compared with a fully centralized implementation, that is, one where the data are collected beforehand on a single clustering agent.
A decentralized algorithm for distributed ensemble clustering / Rosato, A.; Altilio, R.; Panella, M. - In: INFORMATION SCIENCES. - ISSN 0020-0255. - 578:(2021), pp. 417-434. [10.1016/j.ins.2021.07.028]
A decentralized algorithm for distributed ensemble clustering
Rosato A.; Altilio R.; Panella M.
2021
File | Access | Type | License | Size | Format
---|---|---|---|---|---
Rosato_Decentralized-algorithm_2021.pdf | Archive managers only | Publisher's version (published with the publisher's layout) | All rights reserved | 1.71 MB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.