Missing data imputation by Multitree

DI CIACCIO, AGOSTINO
2015

Abstract

With big administrative data, we often face a large number of variables with different measurement levels and many missing values. The correct approach to handling these situations depends on the type of data and the purpose of the analysis. However, we cannot simply delete the incomplete records, because doing so amounts to a substantial loss of data that were costly to collect. Single imputation or multiple imputation can be applied to pursue different aims: to create an ‘imputed’ data matrix with the same characteristics as the observed data, or to take account, in the estimation of a model, of the additional variability due to the imputation process. For big administrative data, several approaches have been proposed in the literature. In this paper we compare different approaches, considering both single and multiple imputation, and we propose a new method, named Multitree. Through simulations, we show that Multitree is competitive with the best methods considered in the literature.
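The distinction between the two aims can be illustrated with a minimal sketch. It is not the Multitree method, which this abstract does not describe. It assumes a single numeric variable, uses mean imputation as the single-imputation example, and uses an approximate Bayesian bootstrap hot-deck pooled with Rubin's rules as the multiple-imputation example. The function names `single_mean_impute` and `multiple_impute_mean` and the toy data are hypothetical.

```python
import random
import statistics


def single_mean_impute(values):
    """Single imputation: fill each missing value (None) with the observed mean."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


def multiple_impute_mean(values, m=20, seed=0):
    """Estimate the mean from m imputed data sets pooled with Rubin's rules.

    Illustrative stand-in only (approximate Bayesian bootstrap hot-deck),
    NOT the Multitree method. Returns (pooled_estimate, total_variance).
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    n = len(values)
    estimates, within = [], []
    for _ in range(m):
        # Resample the donors first, so that uncertainty about the
        # observed distribution is carried into the imputations.
        donors = rng.choices(observed, k=len(observed))
        completed = [rng.choice(donors) if v is None else v for v in values]
        estimates.append(statistics.mean(completed))
        # Variance of the mean within this completed data set.
        within.append(statistics.variance(completed) / n)
    pooled = statistics.mean(estimates)
    w = statistics.mean(within)          # average within-imputation variance
    b = statistics.variance(estimates)   # between-imputation variance
    total = w + (1 + 1 / m) * b          # Rubin's total variance
    return pooled, total


if __name__ == "__main__":
    data = [2.0, None, 3.5, 4.0, None, 5.5, 6.0, None, 7.5, 8.0]
    print(single_mean_impute(data))
    print(multiple_impute_mean(data))
```

The single-imputed matrix can be passed to any complete-data analysis, but treating the filled-in values as observed understates the uncertainty. The between-imputation term `b` is what adds back the variability due to the imputation process.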
2015 IFCS Conference
04 Publication in conference proceedings::04d Abstract in conference proceedings
Missing data imputation by Multitree / DI CIACCIO, Agostino. - ELECTRONIC. - 1:(2015), pp. 103-104. (Paper presented at the 2015 IFCS Conference, held in Bologna, 6-8 July).

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/816629