Genetic clustering algorithms: A comparison simulation study / Baragona, Roberto; Bocci, Laura; Medaglia, Carlo Maria. - In: INTERNATIONAL JOURNAL OF MODELLING & SIMULATION. - ISSN 0228-6203. - 26:3(2006), pp. 190-200. [10.2316/journal.205.2006.3.205-4159]

Genetic clustering algorithms: A comparison simulation study

BARAGONA, Roberto; BOCCI, Laura
2006

Abstract

This paper investigates, through a simulation experiment, the performance of genetic algorithms in solving some clustering problems. If the number of clusters is known in advance, our results show that the genetic algorithm is able to find the right partition, almost irrespective of the genetic parameters selected. The genetic algorithm also always compares favourably with the K-means algorithm. If the number of clusters is unknown, the genetic algorithm still provides good results. Four versions of the genetic algorithm proposed in the literature are compared, and their performances are not found to differ significantly. However, all algorithms have to be supplied with a reasonable positive integer for the maximum number of clusters; otherwise, the estimated number of clusters is not very near to the true value. Moreover, if the points are not equally partitioned into clusters, performance deteriorates considerably. By contrast, other sources of perturbation, such as outliers or data errors, do not affect the results.
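To make the comparison described in the abstract concrete, the following is a minimal sketch of a genetic clustering algorithm for a known number of clusters, set alongside a plain K-means routine. It is not one of the four versions compared in the paper. The centroid-based chromosome encoding, tournament selection, one-point crossover, Gaussian mutation, elitism, the within-cluster sum of squares as fitness, and all names and parameter values (`genetic_clustering`, `pop_size`, `p_mut`, `sigma`, and so on) are assumptions made for illustration only.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def wcss(points, centroids):
    """Within-cluster sum of squares; lower is better (assumed fitness)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd-style K-means started from k randomly chosen points."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        new = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster happens to be empty.
            new.append(tuple(sum(x) / len(members) for x in zip(*members))
                       if members else centroids[j])
        if new == centroids:
            break
        centroids = new
    return centroids


def genetic_clustering(points, k, rng, pop_size=20, generations=40,
                       p_mut=0.1, sigma=0.5):
    """Hypothetical GA: each chromosome is a list of k centroids."""
    pop = [rng.sample(points, k) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda c: wcss(points, c))
        history.append(wcss(points, pop[0]))
        children = [pop[0]]  # elitism: the best chromosome survives unchanged
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            a = min(rng.sample(pop, 2), key=lambda c: wcss(points, c))
            b = min(rng.sample(pop, 2), key=lambda c: wcss(points, c))
            # One-point crossover on the list of centroids.
            cut = rng.randrange(1, k) if k > 1 else 0
            child = a[:cut] + b[cut:]
            # Gaussian mutation applied independently to each centroid.
            child = [tuple(x + rng.gauss(0, sigma) for x in c)
                     if rng.random() < p_mut else c
                     for c in child]
            children.append(child)
        pop = children
    best = min(pop, key=lambda c: wcss(points, c))
    history.append(wcss(points, best))
    return best, history


def make_blobs(rng, centers, n_per_cluster, spread=0.3):
    """Simulated data: equally sized Gaussian clusters around given centers."""
    return [(cx + rng.gauss(0, spread), cy + rng.gauss(0, spread))
            for cx, cy in centers for _ in range(n_per_cluster)]


rng = random.Random(0)
data = make_blobs(rng, [(0, 0), (5, 5), (0, 5)], 30)
ga_centroids, history = genetic_clustering(data, 3, rng)
km_centroids = kmeans(data, 3, rng)
print("GA WCSS:     ", wcss(data, ga_centroids))
print("K-means WCSS:", wcss(data, km_centroids))
```

Because the best chromosome is carried over unchanged, the best fitness recorded in `history` cannot get worse from one generation to the next. Changing `make_blobs` to produce clusters of unequal size, or adding outliers, gives a rough way to probe the perturbation scenarios the abstract mentions, although this sketch makes no claim to reproduce the paper's experimental design.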
Latin square design; genetic algorithms; Monte Carlo simulation; cluster analysis
01 Journal publication::01a Journal article
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/142002