Twelve parsimonious models for clustering mixed-type (ordinal and continuous) data are proposed. Ordinal and continuous data are assumed to follow a multivariate finite mixture of Gaussians. Two closely related issues arise as the dimensionality of the data increases: the number of parameters increases exponentially, and a large number of ordinal variables makes full maximum likelihood estimation infeasible. To address the first issue, the model should be more parsimonious in the number of parameters to estimate. To this end, a general class of eight parsimonious mixture models for mixed-type data is defined by imposing a factor decomposition on the component-specific covariance matrices. The loadings and the error variances of the factor model may be constrained to be equal or left unequal across mixture components. To add flexibility while maintaining a degree of parsimony, four further models are defined, in which the latent factors are the same in each cluster but have different variances. A useful feature of these semi-constrained models is that, under mild conditions, the factors are unique; in other words, they cannot be rotated as in the classical factor analysis model. To address the second issue, a composite likelihood approach is adopted, and parameter estimates are computed with an EM-type algorithm based on the composite likelihood. The proposal is evaluated through a simulation study and an application to real data.
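To illustrate why a factor decomposition reduces the number of covariance parameters, here is a minimal sketch that is not taken from the abstract. It assumes the usual factor-analytic form Sigma_g = Lambda_g Lambda_g' + Psi_g, with a p x q loading matrix Lambda_g and a diagonal error covariance Psi_g, and uses the standard count pq - q(q-1)/2 for free loadings. The symbols p (variables), q (factors) and G (components) and both function names are hypothetical. The count covers covariance parameters only: it ignores means, mixing weights, thresholds, and any identification constraints specific to ordinal variables.

```python
def full_cov_params(p: int, G: int) -> int:
    """Free covariance parameters for G unrestricted p x p covariance matrices."""
    return G * p * (p + 1) // 2


def factor_cov_params(p: int, q: int, G: int,
                      equal_loadings: bool, equal_errors: bool) -> int:
    """Free covariance parameters under Sigma_g = Lambda_g Lambda_g' + Psi_g.

    Assumption: Psi_g is diagonal and the loadings are counted up to
    rotation, i.e. p*q - q*(q-1)/2 free entries per loading matrix.
    """
    loadings = p * q - q * (q - 1) // 2   # one loading matrix
    errors = p                            # one diagonal error covariance
    n_loadings = loadings if equal_loadings else G * loadings
    n_errors = errors if equal_errors else G * errors
    return n_loadings + n_errors


if __name__ == "__main__":
    p, q, G = 10, 2, 3  # illustrative sizes, not from the abstract
    print("full covariance:", full_cov_params(p, G))
    for eq_load in (False, True):
        for eq_err in (False, True):
            n = factor_cov_params(p, q, G, eq_load, eq_err)
            print(f"equal loadings={eq_load}, equal errors={eq_err}: {n}")
```

The four combinations in the loop correspond only to the equal or unequal constraints on loadings and error variances named in the abstract; the sketch does not attempt to reproduce the full set of twelve models.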

Parsimonious and semi-constrained models for clustering mixed-type data through a composite likelihood approach / Ranalli, Monia; Rocci, Roberto. - (2023), pp. 116-116. (Paper presented at the conference Econometrics and Statistics (EcoSta 2023), held in Tokyo, Japan).

Parsimonious and semi-constrained models for clustering mixed-type data through a composite likelihood approach

Monia Ranalli; Roberto Rocci (second author)
2023

Econometrics and Statistics (EcoSta 2023)
04 Publication in conference proceedings::04d Abstract in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1728024