Handling Missing Data in Presence of Categorical Variables: a New Imputation Procedure / Ferrari, P. A.; Barbiero, A.; Manzi, G. - (2011), pp. 473-480. [10.1007/978-3-642-11363-5_53].

Handling Missing Data in Presence of Categorical Variables: a New Imputation Procedure

P.A. Ferrari; A. Barbiero; G. Manzi

2011

Abstract

This paper proposes a new method for handling missing values in categorical data. The method is a forward imputation procedure, presented in the context of Nonlinear Principal Component Analysis (NLPCA) used to derive indicators from a large dataset; it can, however, be readily adopted in other contexts and with other multivariate techniques. We discuss the statistical features of the imputation technique in relation to other missing-data treatments that are popular among NLPCA users. Its performance is then compared with those treatments in a simulation study based on a real dataset extracted from the Eurobarometer survey: missing data are created in the original data matrix, and each treatment is assessed by how close its NLPCA outcomes are to those obtained from the original data. The new procedure is seen to provide better results than the other methods under the different conditions considered.
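The simulation design described in the abstract (delete cells from a known data matrix, treat the gaps, compare against the original) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy data, the 10% deletion rate, the helper names `make_missing`, `mode_impute` and `recovery_rate`, and the closeness measure. In particular, mode imputation is only a baseline stand-in and not the forward imputation procedure of the paper, and the share of exactly recovered cells is a simpler proxy than comparing NLPCA outcomes.

```python
import numpy as np

def make_missing(data, rate, rng):
    """Return a float copy of `data` with roughly `rate` of its cells set to NaN,
    chosen completely at random, together with the boolean deletion mask."""
    masked = data.astype(float).copy()
    mask = rng.random(data.shape) < rate
    masked[mask] = np.nan
    return masked, mask

def mode_impute(masked):
    """Baseline stand-in (NOT the paper's method): fill each column's gaps
    with that column's most frequent observed category."""
    filled = masked.copy()
    for j in range(filled.shape[1]):
        col = filled[:, j]  # view into `filled`, so assignment edits it in place
        observed = col[~np.isnan(col)]
        values, counts = np.unique(observed, return_counts=True)
        col[np.isnan(col)] = values[np.argmax(counts)]
    return filled

def recovery_rate(original, filled, mask):
    """Share of artificially deleted cells whose category was restored exactly."""
    return float(np.mean(original[mask] == filled[mask]))

rng = np.random.default_rng(0)
original = rng.integers(1, 5, size=(200, 6))  # hypothetical ordinal items, categories 1..4
masked, mask = make_missing(original, rate=0.1, rng=rng)
filled = mode_impute(masked)
score = recovery_rate(original, filled, mask)
print(f"deleted cells: {mask.sum()}, exactly recovered share: {score:.2f}")
```

To compare several treatments in the spirit of the study, one would apply each of them to the same `masked` matrix and replace `recovery_rate` with a measure computed on the outputs of the multivariate technique of interest.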

Year of publication: 2011
Volume title: New Perspectives in Statistical Modeling and Data Analysis - Studies in Classification, Data Analysis, and Knowledge Organization
ISBN: 978-3-642-11362-8
Type: 02 Publication in volume::02a Chapter or Article
Citation: Handling Missing Data in Presence of Categorical Variables: a New Imputation Procedure / Ferrari, P. A.; Barbiero, A.; Manzi, G. - (2011), pp. 473-480. [10.1007/978-3-642-11363-5_53].


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1727275
