The quantity and quality of administrative information available to National Statistical Institutes have been constantly increasing over the past several years. However, different sources of administrative data are not expected to each have the same population coverage, so that estimating the true population size from the collective set of data poses several methodological challenges that set the problem apart from a classical capture-recapture setting. In this article, we consider two specific aspects of this problem: (1) misclassification of the units, leading to lists with both overcoverage and undercoverage; and (2) lists focusing on a specific subpopulation, leaving a proportion of the population with null probability of being captured. We propose an approach to this problem that employs a class of capture-recapture methods based on Latent Class models. We assess the proposed approach via a simulation study, then apply the method to five sources of empirical data to estimate the number of active local units of Italian enterprises in 2011.

Population Size Estimation Using Multiple Incomplete Lists with Overcoverage / Di Cecco, D.; Di Zio, M.; Filipponi, D.; Rocchetti, I.. - In: JOURNAL OF OFFICIAL STATISTICS. - ISSN 0282-423X. - 34:2(2018), pp. 557-572. [10.2478/jos-2018-0026]

Population Size Estimation Using Multiple Incomplete Lists with Overcoverage

Di Cecco, D.; Di Zio, M.; Filipponi, D.; Rocchetti, I.
2018

Keywords: capture-recapture models; latent class models; missing data; multisource integration
01 Journal publication::01a Journal article
Files attached to this record
DiCecco_Population-Size_2018.pdf

Open access

Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 276 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1357019
Citations
  • PubMed Central: n/a
  • Scopus: 11
  • Web of Science (ISI): 12