Multivariate regression model on latent predictors / Martella, Francesca; Vicari, Donatella. - In: ADVANCES IN DATA ANALYSIS AND CLASSIFICATION. - ISSN 1862-5347. - (2026), pp. 1-39.
Multivariate regression model on latent predictors
Francesca Martella; Donatella Vicari
2026
Abstract
In multivariate regression problems with many explanatory variables, researchers often face a trade-off between predictive accuracy and interpretability. While penalized and dimension-reduction methods effectively handle multicollinearity and overparameterization, they typically produce latent components that are difficult to interpret. In this paper, we introduce the Multivariate Regression Model based on Latent Predictors (MRMoLP), a semi-supervised approach designed to enhance interpretability without abandoning predictive performance. The key idea is to construct latent predictors as linear combinations of disjoint groups of explanatory variables that share similar predictive behavior toward the responses. Unlike reflective approaches that rely on correlation structures, MRMoLP adopts a formative perspective, grouping potentially heterogeneous and weakly correlated variables solely according to their contribution to prediction. Each latent predictor is therefore defined by a specific subset of covariates, yielding a clear semantic interpretation and uncovering meaningful macro-constructs with predictive power. Model parameters are estimated via maximum likelihood using an Expectation Conditional Maximization (ECM) algorithm. Through extensive simulations, we show that MRMoLP accurately recovers the underlying grouping structure and achieves competitive predictive performance compared to established methods such as PLSR, while providing substantially improved interpretability. An application to real-world well-being data illustrates how the method identifies coherent predictor groups and delivers results that are both statistically sound and substantively meaningful.
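To make the key idea concrete, the following is a minimal sketch of how latent predictors can be built as linear combinations of disjoint groups of explanatory variables and then used to predict several responses. It is not the authors' ECM procedure: the grouping and the within-group weights are fixed by hand here, whereas the abstract describes them as estimated by maximum likelihood. The symbols `V`, `w`, `A` and `C`, the dimensions, and the simulated data are all assumptions introduced for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: observations, covariates, latent predictors, responses.
n, J, Q, M = 200, 6, 2, 3

# Assumed disjoint grouping: covariates 0-2 form group 0, covariates 3-5 form group 1.
groups = np.array([0, 0, 0, 1, 1, 1])
V = np.eye(Q)[groups]            # J x Q binary membership matrix, exactly one 1 per row

# Assumed within-group weights, scaled to unit length inside each group.
w = rng.uniform(0.5, 1.5, size=J)
for q in range(Q):
    w[groups == q] /= np.linalg.norm(w[groups == q])

A = w[:, None] * V               # J x Q loading matrix, one non-zero entry per row

# Simulated data: each latent predictor is a weighted sum of its own group only.
X = rng.normal(size=(n, J))
Z = X @ A                        # n x Q latent predictors
C_true = rng.normal(size=(Q, M)) # regression coefficients of responses on latent predictors
Y = Z @ C_true + 0.1 * rng.normal(size=(n, M))

# With grouping and weights held fixed, regress Y on Z by ordinary least squares.
C_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
```

Because every row of `A` has a single non-zero entry, each covariate contributes to exactly one latent predictor, which is what gives each latent predictor a clear interpretation in terms of a specific subset of covariates.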
File: Martella_MRMoLP_2026.pdf
Access: archive managers only
Type: Post-print (version following peer review and accepted for publication)
License: All rights reserved
Size: 1.54 MB
Format: Adobe PDF


