Recently, data augmentation in the semi-supervised regime, where unlabeled data vastly outnumbers labeled data, has received considerable attention. In this paper, we describe an efficient technique for this task, exploiting a framework we recently proposed for missing data imputation, called the graph imputation neural network (GINN). The key idea is to leverage both supervised and unsupervised data to build a graph of similarities between points in the dataset. We then augment the dataset by severely damaging a few of the nodes (up to 80% of their features) and reconstructing them using a variation of GINN. On several benchmark datasets, we show that our method can obtain significant improvements compared to a fully-supervised model, and we are able to augment the datasets up to a factor of. This points to the power of graph-based neural networks to represent structural affinities in the samples for tasks of data reconstruction and augmentation.
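The pipeline summarized in the abstract (similarity graph, heavy feature damage, reconstruction) can be illustrated with a minimal sketch. This is not the GINN model: the reconstruction step below is a plain average over graph neighbours, used only as a stand-in for the learned network. The function names `knn_similarity_graph` and `augment`, the k-nearest-neighbour graph construction, and the choice of Euclidean distance are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def knn_similarity_graph(X, k=5):
    """Symmetric k-nearest-neighbour adjacency matrix (Euclidean distance).

    Hypothetical helper: the abstract only says that a similarity graph is built.
    """
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # exclude self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]      # k closest points per row
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)                # make the graph undirected


def augment(X, A, n_new=10, damage=0.8, rng=None):
    """Create new samples by damaging nodes and refilling the missing features.

    A fraction `damage` of the features of each selected node is dropped and
    replaced by the mean of its graph neighbours (stand-in for GINN).
    """
    rng = np.random.default_rng(rng)
    n, f = X.shape
    n_drop = int(round(damage * f))
    nodes = rng.choice(n, size=n_new, replace=True)
    new_rows = np.empty((n_new, f))
    for i, node in enumerate(nodes):
        dropped = rng.choice(f, size=n_drop, replace=False)
        row = X[node].copy()
        nbr_mean = X[A[node] > 0].mean(axis=0)
        row[dropped] = nbr_mean[dropped]     # reconstruct damaged features
        new_rows[i] = row
    return new_rows, nodes


# Toy usage on synthetic data (not one of the paper's benchmarks).
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(50, 10))
A = knn_similarity_graph(X, k=5)
X_new, src = augment(X, A, n_new=20, damage=0.8, rng=1)
X_aug = np.vstack([X, X_new])
```

In this sketch each synthetic row keeps 20% of the features of its source node and takes the rest from its neighbourhood, so the new points stay close to the local structure encoded by the graph. A learned model such as GINN would replace the neighbour mean with a trained reconstruction.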
Efficient data augmentation using graph imputation neural networks / Spinelli, I.; Scardapane, S.; Scarpiniti, M.; Uncini, A. - (2021), pp. 57-66. - SMART INNOVATION, SYSTEMS AND TECHNOLOGIES. [10.1007/978-981-15-5093-5_6].
Efficient data augmentation using graph imputation neural networks
Spinelli I.; Scardapane S.; Scarpiniti M.; Uncini A.
2021
File | Size | Format | |
---|---|---|---|
Spinelli_post-print_Efficient-data_2021.pdf (archive managers only). Type: post-print document (version following peer review and accepted for publication). License: All rights reserved | 266.93 kB | Adobe PDF | Contact the author |
Spinelli_Efficient-data_2021.pdf (archive managers only). Type: publisher's version (published version with the publisher's layout). License: All rights reserved | 257.18 kB | Adobe PDF | Contact the author |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.