When analyzing administrative data, we often face a large number of units and variables together with many missing observations. Sometimes it is also necessary to merge large data sets that have only some variables in common. The correct approach to these situations depends on the type of data and the purpose of the analysis. However, we cannot simply delete the incomplete records, because doing so amounts to a substantial loss of data that were costly to collect. In this paper we compare different approaches that can be implemented using specific software. Single imputation, multiple imputation and non-parametric methods are considered, with an application to a European statistical survey.
Handling missing data in large data sets / DI CIACCIO, Agostino. - Print. - (2014), pp. 69-69. (Paper presented at the CONFERENCE OF EUROPEAN STATISTICS STAKEHOLDERS, CESS2014, held in Rome on 24-25 November 2014).
Handling missing data in large data sets
DI CIACCIO, AGOSTINO
2014