In this paper, we study the embedded feature selection problem in linear Support Vector Machines (SVMs), in which a cardinality constraint is employed, leading to a fully explainable selection model. The problem is NP-hard due to the presence of the cardinality constraint, even though the original linear SVM amounts to a problem solvable in polynomial time. To handle the hard problem, we first introduce two mixed-integer formulations for which novel SDP relaxations are proposed. Exploiting the sparsity pattern of the relaxations, we decompose the problems and obtain equivalent relaxations in a much smaller cone, making the conic approaches scalable. To make the best usage of the decomposed relaxations, we propose heuristics using the information of its optimal solution. Moreover, an exact procedure is proposed by solving a sequence of mixed-integer decomposed SDPs. Numerical results on classical benchmarking datasets are reported, showing the efficiency and effectiveness of our approach.

Feature selection in linear SVMs via hard cardinality constraint: a scalable SDP decomposition approach / Bomze, Immanuel; D'Onofrio, Federico; Palagi, Laura; Peng, Bo. - (2024).

Feature selection in linear SVMs via hard cardinality constraint: a scalable SDP decomposition approach

Immanuel Bomze
;
Federico D'Onofrio
;
Laura Palagi
;
2024

Abstract

In this paper, we study the embedded feature selection problem in linear Support Vector Machines (SVMs), in which a cardinality constraint is employed, leading to a fully explainable selection model. The problem is NP-hard due to the presence of the cardinality constraint, even though the original linear SVM amounts to a problem solvable in polynomial time. To handle the hard problem, we first introduce two mixed-integer formulations for which novel SDP relaxations are proposed. Exploiting the sparsity pattern of the relaxations, we decompose the problems and obtain equivalent relaxations in a much smaller cone, making the conic approaches scalable. To make the best usage of the decomposed relaxations, we propose heuristics using the information of its optimal solution. Moreover, an exact procedure is proposed by solving a sequence of mixed-integer decomposed SDPs. Numerical results on classical benchmarking datasets are reported, showing the efficiency and effectiveness of our approach.
2024
File allegati a questo prodotto
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1708982
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact