Catalogo dei prodotti della ricerca

Feature selection methods are used in machine learning and data analysis to select a subset of features that may be successfully used in the construction of a model for the data. These methods are applied under the assumption that often many of the available features are redundant for the purpose of the analysis. In this paper, we focus on a particular method for feature selection in supervised learning problems, based on a linear programming model with integer variables. For the solution of the optimization problem associated with this approach, we propose a novel robust metaheuristics algorithm that relies on a Greedy Randomized Adaptive Search Procedure, extended with the adoption of short memory and a local search strategy. The performances of our heuristic algorithm are successfully compared with those of well-established feature selection methods, both on simulated and real data from biological applications. The obtained results suggest that our method is particularly suited for problems with a very large number of binary or categorical features.

Integer programming models for feature selection: new extensions and a randomized solution algorithm / Bertolazzi, P.; Felici, Giovanni; Festa, P.; Fiscon, Giulia; Weitschek, E.. - In: EUROPEAN JOURNAL OF OPERATIONAL RESEARCH. - ISSN 0377-2217. - STAMPA. - (2016), pp. 389-399. [10.1016/j.ejor.2015.09.051]

Integer programming models for feature selection: new extensions and a randomized solution algorithm

Bertolazzi, P.;FELICI, GIOVANNI;Festa, P.;FISCON, GIULIA;Weitschek, E.

2016

Abstract

Feature selection methods are used in machine learning and data analysis to select a subset of features that may be successfully used in the construction of a model for the data. These methods are applied under the assumption that often many of the available features are redundant for the purpose of the analysis. In this paper, we focus on a particular method for feature selection in supervised learning problems, based on a linear programming model with integer variables. For the solution of the optimization problem associated with this approach, we propose a novel robust metaheuristics algorithm that relies on a Greedy Randomized Adaptive Search Procedure, extended with the adoption of short memory and a local search strategy. The performances of our heuristic algorithm are successfully compared with those of well-established feature selection methods, both on simulated and real data from biological applications. The obtained results suggest that our method is particularly suited for problems with a very large number of binary or categorical features.

Scheda breve

Scheda completa

	Anno di pubblicazione
	
				2016
			
	Parole chiave
	
				Data mining; HeuristicsInteger; programming
			
	Tipologia
	
				01 Pubblicazione su rivista::01a Articolo in rivista
			
	Citazione
	
				Integer programming models for feature selection: new extensions and a randomized solution algorithm / Bertolazzi, P.; Felici, Giovanni; Festa, P.; Fiscon, Giulia; Weitschek, E.. - In: EUROPEAN JOURNAL OF OPERATIONAL RESEARCH. - ISSN 0377-2217. - STAMPA. - (2016), pp. 389-399. [10.1016/j.ejor.2015.09.051]
			
	Appartiene alla tipologia:
	
				01a Articolo in rivista

File allegati a questo prodotto

File	Dimensione	Formato
Fiscon_Integer_2016.pdf solo gestori archivio Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 502.95 kB Formato Adobe PDF Contatta l'autore	502.95 kB	Adobe PDF	Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/835681

Citazioni

ND

52

48

social impact