Research outputs catalogue

Una nuova procedura di imputazione di dati mancanti basata sugli alberi di decisione [A new procedure for the imputation of missing data based on decision trees]

DI CIACCIO, Agostino; GIORGI, Giovanni Maria

2012

Abstract

We propose a procedure for imputing missing data based on sequential regression and classification trees. We consider non-monotone missing-data patterns in which the variables have mixed measurement levels. The aim is to obtain a completed data matrix with optimal characteristics (with respect to the means, variances and correlations of the variables), which is often the main requirement of a statistical office. We also want a measure of the additional variability due to the presence of missing values. A simulation with qualitative and quantitative data is analyzed and the results are compared with those of other procedures. In particular, the performance of Multiple Imputation, using IVEWARE, was compared with that of our proposal in a large simulation with artificial data and on the EU-SILC cross-sectional data. Our non-parametric method proved very competitive in these simulations.
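The abstract does not say how the additional variability due to missing values is measured. In the Multiple Imputation framework used as the comparison, this is conventionally done with Rubin's combining rules, which split the variance of a pooled estimate into a within-imputation and a between-imputation part. The sketch below illustrates those rules only; it is not the authors' tree-based procedure, and the function name `pool_rubin` and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, within_variances):
    """Pool m completed-data estimates with Rubin's combining rules.

    estimates        -- the same quantity (e.g. a mean) computed on each of
                        the m completed data sets
    within_variances -- the estimated sampling variance of that quantity in
                        each completed data set
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    w = mean(within_variances)     # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b        # total variance of the pooled estimate

    # Share of the total variance attributable to the missing values.
    missing_share = (1 + 1 / m) * b / t if t > 0 else 0.0

    return {
        "estimate": q_bar,
        "within": w,
        "between": b,
        "total": t,
        "missing_share": missing_share,
    }


# Hypothetical example: three imputations of a mean, each with variance 4.
result = pool_rubin([10.0, 12.0, 11.0], [4.0, 4.0, 4.0])
print(result)
```

In this made-up example the between-imputation variance is 1.0, so a quarter of the total variance of the pooled estimate is attributed to the missing values.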

Full record

Year of publication: 2012
Keywords: missing data
Type: 01 Journal publication::01a Journal article
Citation: Una nuova procedura di imputazione di dati mancanti basata sugli alberi di decisione / Di Ciaccio, A., Giorgi, G.M. - In: RIVISTA ITALIANA DI ECONOMIA, DEMOGRAFIA E STATISTICA. - ISSN 0035-6832. - Print. - 66, n. 1 (2012), pp. 149-156.

Files attached to this item

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/443392

Warning: the data displayed have not been validated by the university.
