In the release of microdata files, reidentification of a record implies disclosure of the values of a possibly large set of sensitive variables. When microdata files are released by statistical Agencies, a careful assessment of the associated disclosure risk is therefore required. In order for an informed decision to be made, maximising accuracy and precision of the risk estimators is crucial. Clearly such characteristics will affect the risk assessment process and Agencies should choose the estimator that performs best. In fact, estimators may perform poorly, especially for those records whose real risk is higher. To improve estimation, we propose to introduce external information, arising from a previous census as is done in the context of small area estimation. We previously considered SPREE - type estimators that use the association structure observed at a previous census; in this paper we consider models that use the structure of a population contingency table while allowing for smooth variation of the latter. To assess the statistical properties of this estimator and compare it with alternative approaches, we show results of a simulation study that is based on a complex sampling scheme, typical of most households surveys in Italy. Comparison is made with a simple SPREE estimator and a Skinner-type estimator, applied to a complex sampling scheme.
Use of Auxiliary Information in Risk Estimation / L., Di Consiglio; Polettini, Silvia. - STAMPA. - 5262(2008), pp. 213-226. - LECTURE NOTES IN COMPUTER SCIENCE. [10.1007/978-3-540-87471-3_18].
Use of Auxiliary Information in Risk Estimation
POLETTINI, SILVIA
2008
Abstract
In the release of microdata files, reidentification of a record implies disclosure of the values of a possibly large set of sensitive variables. When microdata files are released by statistical Agencies, a careful assessment of the associated disclosure risk is therefore required. In order for an informed decision to be made, maximising accuracy and precision of the risk estimators is crucial. Clearly such characteristics will affect the risk assessment process and Agencies should choose the estimator that performs best. In fact, estimators may perform poorly, especially for those records whose real risk is higher. To improve estimation, we propose to introduce external information, arising from a previous census as is done in the context of small area estimation. We previously considered SPREE - type estimators that use the association structure observed at a previous census; in this paper we consider models that use the structure of a population contingency table while allowing for smooth variation of the latter. To assess the statistical properties of this estimator and compare it with alternative approaches, we show results of a simulation study that is based on a complex sampling scheme, typical of most households surveys in Italy. Comparison is made with a simple SPREE estimator and a Skinner-type estimator, applied to a complex sampling scheme.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.