Clustering via finite mixture of multivariate factor analysis regression models with clustered predictors / Qin, Xiaoke; Tu, Wangshu; Martella, Francesca; Dang Subedi, Sanjeena. - (2023). (Paper presented at the 2023 Classification Society Annual Meeting, held in Rochester, NY, USA).
Clustering via finite mixture of multivariate factor analysis regression models with clustered predictors.
Francesca Martella
2023
Abstract
Mixture models are a powerful statistical tool for clustering observations, an essential task in many fields, such as machine learning, data analysis, and pattern recognition. The multivariate factor analysis regression model (MFARM) can be used to explore the relationship between the observations and predictors, especially when the predictor matrices are high-dimensional or exhibit multicollinearity. The main disadvantage of MFARMs is that the resulting factors can be difficult to interpret. Here, we propose a finite mixture of MFARMs for clustering both the observations and the predictors that similarly predict the responses. In particular, by replacing the factor loading matrix with a binary row-stochastic matrix in the factor analyzer structure, the predictors that similarly predict the responses can be clustered into groups such that each predictor is associated with only one of the factors. An alternating expectation-conditional maximization algorithm is used for parameter estimation. Application of the proposed approach to both simulated and real datasets is presented and discussed.
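To illustrate what a binary row-stochastic matrix looks like in this setting, the following minimal Python sketch builds one from a made-up assignment of predictors to factors. It is not the authors' implementation and does not perform any estimation: the labels, the number of factors, and the helper names `membership_matrix` and `is_binary_row_stochastic` are all hypothetical and chosen only for illustration.

```python
# Hypothetical example: 5 predictors grouped into 2 factors.
# labels[j] is the (assumed, made-up) factor index of predictor j.
labels = [0, 0, 1, 1, 0]
n_factors = 2


def membership_matrix(labels, n_factors):
    """Build a matrix with one row per predictor and one column per factor.

    Entry (j, k) is 1 if predictor j is assigned to factor k, and 0 otherwise.
    """
    return [[1 if k == g else 0 for k in range(n_factors)] for g in labels]


def is_binary_row_stochastic(matrix):
    """Check that every entry is 0 or 1 and that every row sums to exactly 1."""
    return all(set(row) <= {0, 1} and sum(row) == 1 for row in matrix)


Z = membership_matrix(labels, n_factors)

# Reading the matrix column by column recovers the predictor groups.
groups = {
    k: [j for j, row in enumerate(Z) if row[k] == 1]
    for k in range(n_factors)
}

for row in Z:
    print(row)
print(groups)
```

Because each row contains a single 1, every predictor belongs to exactly one group, which is the property the abstract relies on to make the factors easier to interpret than those obtained from an unconstrained loading matrix.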


