This work deals with the Support Vector Machine (SVM) learning process which, as it is well-known, can be reduced to the solution of a quadratic problem in the so-called feature space. For very large datasets there are practical issues in the solution of this optimization problem. In particular it may be impossible to store in memory the Hessian of the quadratic form. Decomposition methods, such as the Sequential Minimal Optimization (SMO) method, overcome this limit by operating with small working sets of indices. While SMO methods use working sets of size two, we study SVM learning techniques with high-dimensional working sets. At every step of the training algorithm, we select a working set of cardinality possibly greater than two and we compute a descent direction, in the working set, by defining a Frank-Wolfe type linear sub-problem. We show that the latter sub-problem can be solved analytically and propose a fast procedure for this purpose, the calculation of the solution require only O(n log(n)) operations in the worst case. Two different SVM training algorithms are proposed along this line and an extensive numerical experimentation is carried out. Numerical results show that datasets such that the training algorithm does not take advantage of second order information, can be trained efficiently with first order information algorithms which use high-dimensional working set
DECOMPOSITION ALGORITHMS FOR SVM TRAINING BASED ON THE FRANK-WOLFE METHOD(2013 Nov 18).
DECOMPOSITION ALGORITHMS FOR SVM TRAINING BASED ON THE FRANK-WOLFE METHOD
-
18/11/2013
Abstract
This work deals with the Support Vector Machine (SVM) learning process which, as it is well-known, can be reduced to the solution of a quadratic problem in the so-called feature space. For very large datasets there are practical issues in the solution of this optimization problem. In particular it may be impossible to store in memory the Hessian of the quadratic form. Decomposition methods, such as the Sequential Minimal Optimization (SMO) method, overcome this limit by operating with small working sets of indices. While SMO methods use working sets of size two, we study SVM learning techniques with high-dimensional working sets. At every step of the training algorithm, we select a working set of cardinality possibly greater than two and we compute a descent direction, in the working set, by defining a Frank-Wolfe type linear sub-problem. We show that the latter sub-problem can be solved analytically and propose a fast procedure for this purpose, the calculation of the solution require only O(n log(n)) operations in the worst case. Two different SVM training algorithms are proposed along this line and an extensive numerical experimentation is carried out. Numerical results show that datasets such that the training algorithm does not take advantage of second order information, can be trained efficiently with first order information algorithms which use high-dimensional working setI documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.