Statistical matching and uncertainty analysis in combining household income and expenditure data / Conti, Pier Luigi; Marella, Daniela; Neri, Andrea. - In: STATISTICAL METHODS & APPLICATIONS. - ISSN 1618-2510. - PRINT. - 26:(2017), pp. 485-505. [10.1007/s10260-016-0374-7]
Statistical matching and uncertainty analysis in combining household income and expenditure data
Pier Luigi Conti; Daniela Marella
2017
Abstract
An important goal of statistical matching is to estimate the joint distribution of variables that are not jointly observed in a single sample survey but are separately available from independent sample surveys. The absence of joint information on the variables of interest leads to uncertainty about the data generating model, since the available sample information cannot discriminate among a set of plausible joint distributions. This paper gives a short review of the concept of uncertainty in statistical matching under logical constraints and of how to measure uncertainty for continuous variables. The notion of matching error is related to an appropriate measure of uncertainty, and a criterion for selecting matching variables is introduced: choose the variables that minimize this uncertainty measure. Finally, a method for choosing a plausible joint distribution for the variables of interest via the iterative proportional fitting algorithm is described. The proposed methodology is then applied to household income and expenditure data when extra-sample information on the average propensity to consume is available. The result is a reconstructed complete dataset in which each record includes measures of both income and expenditure.

File | Size | Format | |
---|---|---|---|
Conti_Statistical-matching_2017.pdf (archive managers only). Type: Publisher's version (published version with the publisher's layout). License: All rights reserved | 1.14 MB | Adobe PDF | Contact the author |
Conti_Statistical-matching_2017.pdf (open access). Note: Version of the work accepted for publication. Type: Pre-print (manuscript submitted to the publisher, prior to peer review). License: All rights reserved | 292.51 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.