Multi Kernel Learning with online-batch optimization / Orabona, Francesco; Jie, Luo; Caputo, Barbara. - In: JOURNAL OF MACHINE LEARNING RESEARCH. - ISSN 1532-4435. - STAMPA. - 13:(2012), pp. 227-253.

Multi Kernel Learning with online-batch optimization

CAPUTO, BARBARA
2012

Abstract

In recent years there has been a lot of interest in designing principled classification algorithms over multiple cues, based on the intuitive notion that using more features should lead to better performance. In the domain of kernel methods, a principled way to use multiple features is the Multi Kernel Learning (MKL) approach. Here we present an MKL optimization algorithm based on stochastic gradient descent that has a guaranteed convergence rate. We directly solve the MKL problem in the primal formulation. Using a p-norm formulation of MKL, we introduce a parameter that controls the level of sparsity of the solution, while leading to an easier optimization problem. We prove theoretically and experimentally that 1) our algorithm has a faster convergence rate as the number of kernels grows; 2) the training complexity is linear in the number of training examples; 3) very few iterations are sufficient to reach good solutions. Experiments on standard benchmark databases support our claims. © 2012 Francesco Orabona, Luo Jie and Barbara Caputo.
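
The abstract describes a primal, stochastic (sub)gradient approach to MKL in which a p-norm over the kernel weights controls the sparsity of the combination. The following is a minimal sketch of that general idea only: it is not the algorithm published in the paper, it replaces kernels with explicit per-cue feature maps, and every function name, hyperparameter, and the toy data are illustrative assumptions.

```python
# Minimal sketch, NOT the authors' published algorithm: primal stochastic
# subgradient descent for an MKL-style objective with a (2, p) group-norm
# regularizer, using explicit per-cue feature maps instead of kernels.
# All names, hyperparameters, and the toy data below are illustrative.
import numpy as np

def pnorm_mkl_sgd(feature_groups, y, p=1.5, lam=0.01, epochs=5, seed=0):
    """feature_groups: list of (n, d_m) arrays, one block per cue/'kernel'.
    y: labels in {-1, +1}. Returns one weight vector per feature block."""
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    W = [np.zeros(X.shape[1]) for X in feature_groups]
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)  # standard 1/(lambda*t) step size
            score = sum(X[i] @ w for X, w in zip(feature_groups, W))
            # Subgradient of 0.5*lam*(sum_m ||w_m||_2^p)^(2/p) w.r.t. each w_m
            norms = np.array([np.linalg.norm(w) for w in W])
            scale = lam * (np.sum(norms ** p) + 1e-12) ** (2.0 / p - 1.0)
            for m, (X, w) in enumerate(zip(feature_groups, W)):
                g_reg = scale * norms[m] ** (p - 2.0) * w if norms[m] > 1e-8 else 0.0
                g_loss = -y[i] * X[i] if y[i] * score < 1.0 else 0.0  # hinge loss
                W[m] = w - eta * (g_reg + g_loss)
    return W

# Toy usage on two synthetic "cues" (hypothetical data, for illustration only).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X1, X2 = rng.normal(size=(200, 5)), rng.normal(size=(200, 3))
    y = np.where(X1[:, 0] + X2[:, 1] > 0, 1.0, -1.0)
    W = pnorm_mkl_sgd([X1, X2], y, p=1.5)
    preds = np.sign(sum(X @ w for X, w in zip([X1, X2], W)))
    print("training accuracy:", (preds == y).mean())
```

With p close to 1 the group norm pushes whole cues toward zero weight (a sparser combination), while larger p spreads weight across cues; this is the sparsity trade-off controlled by the p-norm parameter mentioned in the abstract.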
2012
Convergence bounds; Large scale; Learning kernels; Multiple Kernel Learning; Online optimization; Stochastic subgradient descent; Control and Systems Engineering; Software; Statistics and Probability; Artificial Intelligence
01 Journal publication::01a Journal article
Files attached to this record
No files are associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/951712

Citations
  • PMC: not available
  • Scopus: 43
  • Web of Science (ISI): 40