A Mathematical Programming Approach to Sparse Canonical Correlation Analysis / Amorosi, L.; Padellini, T.; Puerto, J.; Valverde, C. - In: EXPERT SYSTEMS WITH APPLICATIONS. - ISSN 0957-4174. - 237:(2023), pp. 1-21. [10.1016/j.eswa.2023.121293]

A Mathematical Programming Approach to Sparse Canonical Correlation Analysis

Amorosi L.; 2023

Abstract

Recent developments at the interface between Operational Research and Statistics make it possible to exploit advances in Mixed-Integer Optimisation (MIO) solvers to improve the quality of statistical analysis. In this work, we tackle Canonical Correlation Analysis (CCA), a dimensionality reduction method that jointly summarises multiple data sources while retaining their dependency structure. We propose a new technique for encoding sparsity in CCA by means of a mathematical programming formulation that can either be solved exactly with readily available solvers (such as Gurobi) or serve as the basis for designing solution algorithms. Finally, we evaluate the performance of the alternative solution strategies presented on multiple datasets from the literature. The results of this extensive comparison study show that the proposed approach either finds the optimal correlation or finds good-quality solutions that are better than those provided by other conventional methods.
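
To make the notion of an exact sparse solution concrete, the following is a minimal sketch. It is not the authors' MIO formulation. It assumes that sparsity means selecting at most `kx` variables from the first data source and `ky` from the second, and it finds the best pair of supports by plain enumeration, which is only feasible for very small problems. The function names are hypothetical, and the data matrices are assumed to have more rows than columns and full column rank.

```python
from itertools import combinations

import numpy as np


def first_canonical_correlation(X, Y):
    """Largest canonical correlation between the columns of X and Y.

    Assumes both matrices have full column rank and more rows than columns.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Orthonormal bases of the two column spaces; the singular values of
    # Qx^T Qy are the canonical correlations.
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(min(s[0], 1.0))  # clamp tiny floating-point overshoot


def sparse_cca_by_enumeration(X, Y, kx, ky):
    """Exhaustively search all supports of size kx (in X) and ky (in Y).

    Adding a variable can never decrease the canonical correlation, so
    searching supports of exactly kx and ky variables also covers the
    'at most' case.
    """
    best = (-1.0, None, None)
    for sx in combinations(range(X.shape[1]), kx):
        for sy in combinations(range(Y.shape[1]), ky):
            rho = first_canonical_correlation(X[:, list(sx)], Y[:, list(sy)])
            if rho > best[0]:
                best = (rho, sx, sy)
    return best


if __name__ == "__main__":
    # Synthetic data (illustrative only): one column of Y depends on one column of X.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 4))
    Y = rng.standard_normal((200, 3))
    Y[:, 0] = X[:, 2] + 0.1 * rng.standard_normal(200)
    rho, support_x, support_y = sparse_cca_by_enumeration(X, Y, 1, 1)
    print(rho, support_x, support_y)
```

The number of candidate supports grows combinatorially with the number of variables, so enumeration does not scale. That is the kind of limitation a mixed-integer formulation handed to a solver such as Gurobi is meant to address.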
data science; canonical correlation analysis; mixed-integer optimisation; sparsity
01 Journal publication::01a Journal article
Files attached to this item

File: Amorosi_mathematical-programming-approach.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 1.33 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1688934
Citations
  • PMC: N/A
  • Scopus: 2
  • ISI: 2