In data analysis, how to select meaningful variables is a widely debated topic, and several variable selection (or feature reduction) approaches have been proposed in the literature. Although feature selection methods are numerous, most of them are suited to data matrices but not to higher-order structures, mainly because the assessment of variable relevance in a multi-way context has not been extensively discussed. To the best of our knowledge, among the variable selection approaches developed for standard 2-way data arrays, only VIP analysis and the selectivity ratio have been extended to higher-order structures. This gap does not reflect a lack of relevance; on the contrary, the ability to select information from a complex data set such as a multi-way structure is crucial. In light of these considerations, the present paper discusses a feature selection strategy for N-way data based on the Covariance Selection (CovSel) approach, hence called N-CovSel. The method allows the selection of features of different dimensionality (from 1-way up to (N-1)-way), depending on the nature of the original data array. It has been applied to a simulated data set, in order to inspect its ability to select features compatible with the ground truth of the system, and to a real data set. In both cases, N-CovSel proved able to select meaningful features. Finally, different strategies for the further analysis of the selected features are proposed: some, based on sequential multi-block methods, provide a further data reduction, while others, based on N-PLS, respect the multi-way nature of the data.
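The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a covariance-based greedy selection could look on a 3-way array, not the authors' implementation. It assumes the usual CovSel recipe (pick the feature with the largest squared covariance with the response, then deflate both blocks by projecting out the selected feature) and extends it in one plausible way: a 1-way feature is a single (j, k) column across samples, and a 2-way feature is a whole slab `X[:, j, :]` scored by the sum of its squared covariances. The function name `n_covsel_sketch`, the `slab` switch, and the toy data are all hypothetical.

```python
import numpy as np


def n_covsel_sketch(X, y, n_select, slab=False):
    """Greedy covariance-based selection on a 3-way array X (samples x J x K).

    slab=False: select single (j, k) columns (1-way features).
    slab=True:  select whole mode-2 slabs X[:, j, :] (2-way features).
    Illustrative assumption only; not the published N-CovSel code.
    """
    n_samples = X.shape[0]
    Xd = X - X.mean(axis=0)                           # centre across samples
    yd = (y - y.mean(axis=0)).reshape(n_samples, -1)  # samples x responses
    selected = []
    for _ in range(n_select):
        # Cross-product of every (j, k) column with every response column
        # (proportional to the covariance, which is enough for ranking).
        cov = np.einsum("ijk,im->jkm", Xd, yd)
        score = (cov ** 2).sum(axis=2)                # J x K squared covariances
        if slab:
            j = int(np.argmax(score.sum(axis=1)))     # aggregate over mode 3
            selected.append(j)
            Z = Xd[:, j, :]                           # samples x K
        else:
            j, k = np.unravel_index(np.argmax(score), score.shape)
            selected.append((int(j), int(k)))
            Z = Xd[:, j, k][:, None]                  # samples x 1
        # Deflation: remove from X and y everything explained by the selection.
        P = Z @ np.linalg.pinv(Z)                     # projector onto span(Z)
        Xd = Xd - np.tensordot(P, Xd, axes=(1, 0))
        yd = yd - P @ yd
    return selected


# Toy data (hypothetical): the response depends on two known entries.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6, 5))
y = 3.0 * X[:, 2, 3] + X[:, 4, 1] + 0.1 * rng.standard_normal(200)

picked_cells = n_covsel_sketch(X, y, n_select=2)
picked_slabs = n_covsel_sketch(X, y, n_select=2, slab=True)
print(picked_cells, picked_slabs)
```

On this toy data the two informative entries, and the slabs containing them, are expected to rank first. Deflating after each pick is what keeps the procedure from selecting redundant features.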

N-CovSel, a new strategy for feature selection in N-way data / Biancolillo, Alessandra; Roger, Jean-Michel; Marini, Federico. - In: ANALYTICA CHIMICA ACTA. - ISSN 1873-4324. - 1231:(2022), pp. 1-13. [10.1016/j.aca.2022.340433]

N-CovSel, a new strategy for feature selection in N-way data

Marini, Federico
Last author
2022

Chemometrics; Covariance selection (CovSel); Multi-linear partial least squares regression (N-PLS); Multi-way data; Sequential and orthogonalized partial least squares (SO-PLS); Variable selection
01 Journal publication::01a Journal article


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1677497