Gaussian Mixture Models (GMMs) are one of the most widespread methodologies for model-based clustering. They assume a multivariate Gaussian distribution for each component of the mixture, centered at the mean vector and with volume, shape and orientation determined by the covariance matrix. To reduce the large number of parameters introduced by the covariance matrices, parsimonious parameterizations of the latter have been proposed in the literature, e.g., the eigen-decomposition and the parsimonious GMMs based on mixtures of probabilistic principal component analyzers and mixtures of factor analyzers. We introduce a new parameterization of a covariance matrix by defining an extended ultrametric covariance matrix, and we embed it in a GMM. This structure can be used to describe multidimensional phenomena that are characterized by nested latent concepts with different levels of abstraction, from the most specific to the most general. The proposal is able to pinpoint a hierarchical structure on the variables for each component of the GMM, thus identifying a different characterization of a multidimensional phenomenon for each component (cluster, subpopulation) of the mixture. At the same time, it defines a new parsimonious GMM, since the ultrametric covariance structure reconstructs the relationships among variables with a limited number of parameters. The proposal is applied to synthetic and real data. On the former, it shows good classification performance compared with the existing parameterizations; on the latter, it also provides insight into the hierarchical relationships among the variables for each cluster.
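To illustrate why parsimonious covariance parameterizations matter, the following minimal sketch counts the free parameters of a GMM with unconstrained (full) covariance matrices. It relies only on the standard counts for a Gaussian mixture (mixing proportions, mean vectors, symmetric covariance matrices); the function names are hypothetical, and the sketch does not reproduce the parameter count of the extended ultrametric structure, which is defined in the article itself.

```python
def full_covariance_params(p: int) -> int:
    """Free parameters of one symmetric p x p covariance matrix."""
    return p * (p + 1) // 2


def gmm_free_params(p: int, g: int) -> int:
    """Free parameters of a g-component GMM on p variables
    with unconstrained covariance matrices (hypothetical helper)."""
    mixing = g - 1                      # proportions sum to one
    means = g * p                       # one mean vector per component
    covariances = g * full_covariance_params(p)
    return mixing + means + covariances


if __name__ == "__main__":
    # The covariance term grows quadratically in the number of variables.
    for p in (5, 10, 20):
        print(p, gmm_free_params(p, g=3))
```

The covariance term dominates as the number of variables grows, which is the motivation for constrained structures such as the one proposed here.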

Gaussian mixture model with an extended ultrametric covariance structure / Cavicchia, Carlo; Vichi, Maurizio; Zaccaria, Giorgia. - In: ADVANCES IN DATA ANALYSIS AND CLASSIFICATION. - ISSN 1862-5355. - 16:2(2022), pp. 399-427. [10.1007/s11634-021-00488-x]

Gaussian mixture model with an extended ultrametric covariance structure

Carlo Cavicchia; Maurizio Vichi; Giorgia Zaccaria
2022

Keywords: ultrametric matrices; parsimonious models; cluster analysis; hierarchical models
01 Journal publication::01a Journal article
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1615269