Catalogo dei prodotti della ricerca

Missing data imputation (MDI) is the task of replacing missing values in a dataset with alternative, predicted ones. Because of the widespread presence of missing data, it is a fundamental problem in many scientific disciplines. Popular methods for MDI use global statistics computed from the entire dataset (e.g., the feature-wise medians), or build predictive models operating independently on every instance. In this paper we propose a more general framework for MDI, leveraging recent work in the field of graph neural networks (GNNs). We formulate the MDI task in terms of a graph denoising autoencoder, where each edge of the graph encodes the similarity between two patterns. A GNN encoder learns to build intermediate representations for each example by interleaving classical projection layers and locally combining information between neighbors, while another decoding GNN learns to reconstruct the full imputed dataset from this intermediate embedding. In order to speed-up training and improve the performance, we use a combination of multiple losses, including an adversarial loss implemented with the Wasserstein metric and a gradient penalty. We also explore a few extensions to the basic architecture involving the use of residual connections between layers, and of global statistics computed from the dataset to improve the accuracy. On a large experimental evaluation with varying levels of artificial noise, we show that our method is on par or better than several alternative imputation methods. On three datasets with pre-existing missing values, we show that our method is robust to the choice of a downstream classifier, obtaining similar or slightly higher results compared to other choices.

Missing data imputation with adversarially-trained graph convolutional networks / Spinelli, I; Scardapane, S; Uncini, A. - In: NEURAL NETWORKS. - ISSN 0893-6080. - 129:(2020), pp. 249-260. [https://doi.org/10.1016/j.neunet.2020.06.005]

Missing data imputation with adversarially-trained graph convolutional networks

Spinelli I;Scardapane S;Uncini A

2020

Abstract

Missing data imputation (MDI) is the task of replacing missing values in a dataset with alternative, predicted ones. Because of the widespread presence of missing data, it is a fundamental problem in many scientific disciplines. Popular methods for MDI use global statistics computed from the entire dataset (e.g., the feature-wise medians), or build predictive models operating independently on every instance. In this paper we propose a more general framework for MDI, leveraging recent work in the field of graph neural networks (GNNs). We formulate the MDI task in terms of a graph denoising autoencoder, where each edge of the graph encodes the similarity between two patterns. A GNN encoder learns to build intermediate representations for each example by interleaving classical projection layers and locally combining information between neighbors, while another decoding GNN learns to reconstruct the full imputed dataset from this intermediate embedding. In order to speed-up training and improve the performance, we use a combination of multiple losses, including an adversarial loss implemented with the Wasserstein metric and a gradient penalty. We also explore a few extensions to the basic architecture involving the use of residual connections between layers, and of global statistics computed from the dataset to improve the accuracy. On a large experimental evaluation with varying levels of artificial noise, we show that our method is on par or better than several alternative imputation methods. On three datasets with pre-existing missing values, we show that our method is robust to the choice of a downstream classifier, obtaining similar or slightly higher results compared to other choices.

Scheda breve

Scheda completa

	Anno di pubblicazione
	
				2020
			
	Parole chiave
	
				imputation; graph neural network; graph data; convolutional network
			
	Tipologia
	
				01 Pubblicazione su rivista::01a Articolo in rivista
			
	Citazione
	
				Missing data imputation with adversarially-trained graph convolutional networks / Spinelli, I; Scardapane, S; Uncini, A. - In: NEURAL NETWORKS. - ISSN 0893-6080. - 129:(2020), pp. 249-260. [https://doi.org/10.1016/j.neunet.2020.06.005]
			
	Appartiene alla tipologia:
	
				01a Articolo in rivista

File allegati a questo prodotto

File	Dimensione	Formato
Scardapane_preprint_Missing_2020.pdf solo gestori archivio Tipologia: Documento in Pre-print (manoscritto inviato all'editore, precedente alla peer review) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 3.12 MB Formato Adobe PDF Contatta l'autore	3.12 MB	Adobe PDF	Contatta l'autore
Spinelli_Missing_2020.pdf solo gestori archivio Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 1.14 MB Formato Adobe PDF Contatta l'autore	1.14 MB	Adobe PDF	Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1486185

Citazioni

ND

82

61

social impact