Cluster-weighted factor analyzers (CWFA) are a versatile class of mixture models designed to estimate the joint distribution of a random vector that includes a response variable along with a set of explanatory variables. They are particularly valuable in situations involving high dimensionality. This paper enhances CWFA models in two notable ways. First, it enables the prediction of multiple response variables while considering their potential interactions. Second, it identifies factors associated with disjoint groups of explanatory variables, thereby improving interpretability. This development leads to the introduction of the multivariate cluster-weighted disjoint factor analyzers (MCWDFA) model. An alternating expectation-conditional maximization algorithm is employed for parameter estimation. The effectiveness of the proposed model is assessed through an extensive simulation study that examines various scenarios. The proposal is applied to crime data from the United States, sourced from the UCI Machine Learning Repository, with the aim of capturing potential latent heterogeneity within communities and identifying groups of socioeconomic features that are similarly associated with factors predicting crime rates. Results provide valuable insights into the underlying structures influencing crime rates, which may potentially be helpful for effective cluster-specific policymaking and social interventions.
Extending Cluster-Weighted Factor Analyzers for multivariate prediction and interpretability / Qin, Xiaoke; Martella, Francesca; Subedi, Sanjeena. - In: JOURNAL OF CLASSIFICATION. - ISSN 1432-1343. - (2025), pp. 1-29.
Extending Cluster-Weighted Factor Analyzers for multivariate prediction and interpretability
Francesca Martella;
2025
| File | Type | License | Size | Format | Access |
|---|---|---|---|---|---|
| Qin_extending-cluster-weighted_2025.pdf | Post-print (version following peer review and accepted for publication) | All rights reserved | 3.31 MB | Adobe PDF | Archive managers only |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


