Clustering via a finite mixture of disjoint factor analysis model / Qin, Xiaoke; Dang, Sanjeena; Martella, Francesca. - (2023), p. 117. (Paper presented at the 16th International Conference of the ERCIM (European Research Consortium for Informatics and Mathematics) Working Group on Computational and Methodological Statistics (CMStatistics 2023), held in Berlin).
Clustering via a finite mixture of disjoint factor analysis model
Francesca Martella
2023
Abstract
Mixture models are a powerful statistical tool for clustering observations, an essential task in many fields. A multivariate factor analysis regression model (MFARM) can be used to explore the relationship between observations and predictors, especially when the predictor matrices are high dimensional. The main disadvantage of MFARMs is that the resulting factors can be difficult to interpret. A finite mixture of MFARMs is proposed for clustering both observations and predictors. In particular, replacing the factor loading matrix in the factor analyzer structure with a binary row-stochastic matrix clusters the predictors into groups such that each predictor is associated with only one factor. An alternating expectation-conditional maximization algorithm is used for parameter estimation. Applications of the proposed approach to both simulated and real datasets are presented and discussed.
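
As a small illustration of the binary row-stochastic structure mentioned in the abstract, the sketch below builds such a matrix for a toy case and reads the predictor groups off its columns. The names `assignment`, `V`, and `is_binary_row_stochastic`, as well as the toy sizes (6 predictors, 3 factors), are assumptions made for this example and do not come from the paper; the sketch shows only the matrix structure, not the authors' model or estimation algorithm.

```python
# Hypothetical toy setup: 6 predictors, 3 factors.
# assignment[j] is the index of the single factor linked to predictor j.
assignment = [0, 0, 1, 2, 1, 2]
n_factors = 3

# Binary row-stochastic matrix: one row per predictor, one column per
# factor, and exactly one entry equal to 1 in each row.
V = [[1 if k == g else 0 for k in range(n_factors)] for g in assignment]


def is_binary_row_stochastic(matrix):
    """Return True if every entry is 0 or 1 and every row sums to 1."""
    return all(
        all(entry in (0, 1) for entry in row) and sum(row) == 1
        for row in matrix
    )


# Reading each column gives the predictors grouped under that factor.
groups = {
    k: [j for j, row in enumerate(V) if row[k] == 1]
    for k in range(n_factors)
}
```

Because each row contains a single 1, every predictor falls into exactly one group, which is the property the abstract relies on for clustering predictors.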