New proposal for linkage error estimation / Tuoto, Tiziana. - In: STATISTICAL JOURNAL OF THE IAOS. - ISSN 1874-7655. - 32:(2016), pp. 413-420.

New proposal for linkage error estimation

Tiziana Tuoto
2016

Abstract

National Statistical Institutes increasingly exploit the combined use of data from different sources. In a context where huge amounts of information, produced by different actors, can be integrated and compared, it becomes even more necessary to assess the quality of the methods and techniques used to achieve the integration results. For data integration at the micro level, record linkage procedures are widely used and generally produce good results when strong identifying variables are available, although these procedures rarely come with associated quality indicators. However, especially in official statistics, quality indicators need to be used in subsequent statistical analyses to guarantee and assess data accuracy and reliability. This paper proposes a method for linkage error estimation. The method enriches the Fellegi and Sunter model for probabilistic record linkage: as is well known, the Fellegi and Sunter decision rule is very effective for identifying links but generally less reliable for evaluating the results. The proposal aims at predicting linkage quality within the Fellegi and Sunter framework by introducing a supervised step.
Keywords: Probabilistic record linkage; linkage errors; linkage quality assessment
Type: Post-print document (version following peer review and accepted for publication)
License: All rights reserved
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1608068