In unsupervised classification, the unknown number of clusters and the lack of inferential tools for assessing and interpreting the final partition are important limitations that can undermine the reliability of the results. In this work, we propose combining unsupervised classification with supervised methods in order to improve the assessment and interpretation of the obtained partition. In particular, the approach combines the k-means (KM) clustering method with logistic regression (LR) modeling to obtain an algorithm that evaluates the partition identified through KM, assesses the correct number of clusters, and verifies the selection of the most important variables. An application to real data is presented to clarify the utility of the proposed approach.

Simultaneous Supervised and Unsupervised Classification Modeling for Assessing Cluster Analysis and Improving Results Interpretability / Fordellone, Mario; Vichi, Maurizio. - (2019), pp. 23-31.

Simultaneous Supervised and Unsupervised Classification Modeling for Assessing Cluster Analysis and Improving Results Interpretability

Fordellone, Mario; Vichi, Maurizio
2019

Statistical Learning of Complex Data
978-3-030-21140-0
Supervised classification, Unsupervised classification, Assessing clustering
02 Publication in volume::02a Chapter or Article
Files attached to this item
File: FordelloneVichi dicembre.pdf
Access: open access
Type: Post-print document (version following peer review and accepted for publication)
License: All rights reserved
Size: 286.42 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1357589