A common approach in multi-task learning is to encourage the tasks to share a low dimensional representation. This has led to the popular method of trace norm regularization, which has proved effective in many applications. In this paper, we extend this approach by allowing the tasks to partition into different groups, within which trace norm regularization is separately applied. We propose a continuous bilevel optimization framework to simultaneously identify groups of related tasks and learn a low dimensional representation within each group. Hinging on recent results on the derivative of generalized matrix functions, we devise a smooth approximation of the upper-level objective via a dual forward-backward algorithm with Bregman distances. This allows us to solve the bilevel problem by a gradient-based scheme. Numerical experiments on synthetic and benchmark datasets support the effectiveness of the proposed method.
Unveiling groups of related tasks in multi-task learning / Frecon, J; Salzo, S; Pontil, M. - (2021), pp. 7134-7141. (Intervento presentato al convegno 25th International Conference on Pattern Recognition (ICPR) tenutosi a Milano) [10.1109/ICPR48806.2021.9413274].
Unveiling groups of related tasks in multi-task learning
Salzo S
;
2021
Abstract
A common approach in multi-task learning is to encourage the tasks to share a low dimensional representation. This has led to the popular method of trace norm regularization, which has proved effective in many applications. In this paper, we extend this approach by allowing the tasks to partition into different groups, within which trace norm regularization is separately applied. We propose a continuous bilevel optimization framework to simultaneously identify groups of related tasks and learn a low dimensional representation within each group. Hinging on recent results on the derivative of generalized matrix functions, we devise a smooth approximation of the upper-level objective via a dual forward-backward algorithm with Bregman distances. This allows us to solve the bilevel problem by a gradient-based scheme. Numerical experiments on synthetic and benchmark datasets support the effectiveness of the proposed method.File | Dimensione | Formato | |
---|---|---|---|
Frecon_Unveiling_2021.pdf
solo gestori archivio
Tipologia:
Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
2.12 MB
Formato
Adobe PDF
|
2.12 MB | Adobe PDF | Contatta l'autore |
Frecon_preprint_Unveiling_2020.pdf
accesso aperto
Note: DOI: 10.1109/ICPR48806.2021.9413274
Tipologia:
Documento in Post-print (versione successiva alla peer review e accettata per la pubblicazione)
Licenza:
Tutti i diritti riservati (All rights reserved)
Dimensione
2.03 MB
Formato
Adobe PDF
|
2.03 MB | Adobe PDF |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.