Multivariate regression model on latent predictors / Martella, Francesca; Vicari, Donatella. - In: ADVANCES IN DATA ANALYSIS AND CLASSIFICATION. - ISSN 1862-5347. - (2026), pp. 1-39.

Multivariate regression model on latent predictors

Francesca Martella; Donatella Vicari
2026

Abstract

In multivariate regression problems with many explanatory variables, researchers often face a trade-off between predictive accuracy and interpretability. While penalized and dimension-reduction methods effectively handle multicollinearity and overparameterization, they typically produce latent components that are difficult to interpret. In this paper, we introduce the Multivariate Regression Model based on Latent Predictors (MRMoLP), a semi-supervised approach designed to enhance interpretability without abandoning predictive performance. The key idea is to construct latent predictors as linear combinations of disjoint groups of explanatory variables that share similar predictive behavior toward the responses. Unlike reflective approaches that rely on correlation structures, MRMoLP adopts a formative perspective, grouping potentially heterogeneous and weakly correlated variables solely according to their contribution to prediction. Each latent predictor is therefore defined by a specific subset of covariates, yielding a clear semantic interpretation and uncovering meaningful macro-constructs with predictive power. Model parameters are estimated via maximum likelihood using an Expectation Conditional Maximization (ECM) algorithm. Through extensive simulations, we show that MRMoLP accurately recovers the underlying grouping structure and achieves competitive predictive performance compared to established methods such as PLSR, while providing substantially improved interpretability. An application to real-world well-being data illustrates how the method identifies coherent predictor groups and delivers results that are both statistically sound and substantively meaningful.
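As a rough illustration of the key idea only, the sketch below builds latent predictors as weighted sums over disjoint groups of explanatory variables. It is not the authors' estimation procedure: in MRMoLP the grouping and the weights are estimated by maximum likelihood via ECM, whereas here the function name `latent_predictors`, the toy data, the grouping, and the weights are all hypothetical and fixed by hand.

```python
from typing import Dict, List, Sequence


def latent_predictors(
    X: Sequence[Sequence[float]],
    groups: Dict[str, List[int]],
    weights: Sequence[float],
) -> Dict[str, List[float]]:
    """Build one latent predictor per group as a weighted sum of its columns.

    Illustrative helper (not from the paper): `groups` maps a label to the
    column indices of X that form that latent predictor.
    """
    # The groups must be disjoint: no column may feed two latent predictors.
    assigned = [j for cols in groups.values() for j in cols]
    if len(assigned) != len(set(assigned)):
        raise ValueError("groups must be disjoint")
    return {
        name: [sum(weights[j] * row[j] for j in cols) for row in X]
        for name, cols in groups.items()
    }


# Hypothetical toy data: 3 observations, 4 explanatory variables.
X = [
    [1.0, 2.0, 0.5, 4.0],
    [0.0, 1.0, 1.5, 2.0],
    [2.0, 0.0, 1.0, 1.0],
]
groups = {"F1": [0, 1], "F2": [2, 3]}  # assumed grouping, not estimated
weights = [0.5, 0.5, 2.0, 1.0]         # assumed weights, not estimated

F = latent_predictors(X, groups, weights)
```

Because each latent predictor depends on its own subset of columns only, it can be read directly in terms of those variables. The responses would then be regressed on `F["F1"]` and `F["F2"]` rather than on all four original columns.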
multivariate regression; latent predictors; dimensionality reduction methods; clustering; maximum likelihood
01 Journal publication::01a Journal article
Attached file: Martella_MRMoLP_2026.pdf (Adobe PDF, 1.54 MB; access restricted to archive managers)
Type: Post-print (version following peer review and accepted for publication)
License: All rights reserved

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1767131