Data mining by evolving agents for clusters discovery and metric learning / Martino, Alessio; Giampieri, Mauro; Luzi, Massimiliano; Rizzi, Antonello. - PRINT. - (2019), pp. 23-35. - SMART INNOVATION, SYSTEMS AND TECHNOLOGIES. [10.1007/978-3-319-95098-3_3].
Data mining by evolving agents for clusters discovery and metric learning
Alessio Martino;Mauro Giampieri;Massimiliano Luzi;Antonello Rizzi
2019
Abstract
In this paper we propose a novel evolutionary agent-based clustering algorithm in which agents act as individuals of an evolving population, each performing a random walk on a different subset of patterns drawn from the entire dataset. The agents are orchestrated by a customised genetic algorithm and perform clustering and feature selection simultaneously. Unlike standard clustering algorithms, each agent is in charge of discovering well-formed (compact and populated) clusters and, at the same time, a suitable subset of features corresponding to the subspace where such clusters lie. This follows a local metric learning approach, in which each cluster is characterised by its own subset of relevant features. The approach not only leads to deeper knowledge of the dataset at hand, revealing clusters that are not evident when using the whole set of features, but is also suitable for large datasets, as each agent processes only a small subset of patterns. We show the effectiveness of our algorithm on synthetic datasets and outline some interesting future work scenarios and extensions.
File | Access | Type | License | Size | Format | |
---|---|---|---|---|---|---|
Martino_indice_Data-mining_2019.pdf | Archive managers only | Other attached material | All rights reserved | 205.74 kB | Adobe PDF | Contact the author |
Martino_Data-mining_2019.pdf | Archive managers only | Publisher's version (published version with the publisher's layout) | All rights reserved | 285.75 kB | Adobe PDF | Contact the author |
Martino_cover_Data-mining_2019.pdf | Archive managers only | Other attached material | All rights reserved | 268.31 kB | Adobe PDF | Contact the author |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
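The abstract states that each cluster is characterised by its own subset of relevant features, and that some clusters are not evident when the whole feature set is used. The sketch below illustrates that one idea on toy data. It is not the authors' implementation: the binary feature mask, the `masked_distance` and `compactness` helpers, and the synthetic cluster are all assumptions made for illustration, and the agent random walk and the genetic algorithm are not modelled.

```python
import math
import random


def masked_distance(x, y, mask):
    """Euclidean distance restricted to the features where mask is 1."""
    return math.sqrt(sum(m * (a - b) ** 2 for a, b, m in zip(x, y, mask)))


def compactness(points, mask):
    """Average masked distance of the points from their centroid.

    Lower values mean a more compact cluster in the selected subspace.
    This measure is an illustrative assumption, not taken from the paper.
    """
    n = len(points)
    centroid = [sum(p[j] for p in points) / n for j in range(len(mask))]
    return sum(masked_distance(p, centroid, mask) for p in points) / n


# Hypothetical toy cluster: tight in features 0 and 1,
# uniformly spread (uninformative) in feature 2.
rng = random.Random(0)
cluster = [
    (rng.gauss(0.5, 0.01), rng.gauss(0.5, 0.01), rng.uniform(0.0, 1.0))
    for _ in range(200)
]

full_space = compactness(cluster, (1, 1, 1))  # all features selected
subspace = compactness(cluster, (1, 1, 0))    # feature 2 masked out

print(f"all features:      {full_space:.3f}")
print(f"features 0-1 only: {subspace:.3f}")
```

With feature 2 masked out, the same points score as far more compact than they do in the full space. This is the sense in which a cluster can be hidden when all features are used together.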