The estimation of population parameters using complex survey data requires careful statistical modelling to account for the design features. This is further complicated by unit and item nonresponse for which a number of methods have been developed in order to reduce estimation bias. In this paper, we address some issues that arise when the target of the inference (i.e. the analysis model or model of interest) is the conditional quantile of a continuous outcome. Survey design variables are duly included in the analysis and a bootstrap variance estimation approach is proposed. Missing data are multiply imputed by means of chained equations. In particular, imputation of continuous variables is based on their empirical distribution, conditional on all other variables in the analysis. This method preserves the distributional relationships in the data, including conditional skewness and kurtosis, and successfully handles bounded outcomes. Our motivating study concerns the analysis of birthweight determinants in a large UK-based cohort of children. A novel finding on the parental conflict theory is reported. R code implementing these procedures is provided.

Estimation of regression quantiles in complex surveys with data missing at random: An application to birthweight determinants / Geraci, Marco. - In: STATISTICAL METHODS IN MEDICAL RESEARCH. - ISSN 0962-2802. - 25:4(2016), pp. 1393-1421.

Estimation of regression quantiles in complex surveys with data missing at random: An application to birthweight determinants

GERACI Marco
2016

Keywords: chained equations; Khmaladze tests; multiple imputation; paediatrics; weights
01 Journal publication::01a Journal article
Files attached to this record
File: Geraci_Estimation-regression-quantiles_2016.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.29 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1463950
Citations
  • PMC: not available
  • Scopus: 29
  • ISI: 22