Clustering Large Data Set: An Applied Comparative Study / Bocci, Laura; Mingo, Isabella. - PRINT. - 1(2012), pp. 3-12. [10.1007/978-3-642-21037-2_1].

Clustering Large Data Set: An Applied Comparative Study

BOCCI, Laura; MINGO, Isabella
2012

Abstract

This study analyzes different strategies for clustering large data sets derived from a social context. Effective and efficient clustering methods for large databases have been tested only in recent years, following the emergence of data mining as a field. This paper proposes a sequential approach that uses a multiobjective genetic algorithm as the clustering technique. The proposed strategy is applied to a real-life data set of approximately 1.5 million workers, and the results are compared with those obtained by other methods in order to identify an unambiguous partitioning of the data.
Published in: Advanced Statistical Methods for the Analysis of Large Data-Sets
ISBN: 9783642210365; 9783642210372
Keywords: cluster analysis; computational statistics; work flexibility
02 Publication in volume::02a Chapter or Article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/399204