Research Products Catalogue

A Bayesian Hierarchical Model Approach to Risk Estimation in Statistical Disclosure Limitation

Silvia Polettini; Julian Stander

2004

Abstract

When microdata files are released for research, external users may attempt to breach confidentiality. For this reason most National Statistical Institutes apply some form of disclosure risk assessment and data protection. Risk assessment first requires a measure of disclosure risk to be defined. In this paper we build on previous work by Benedetti and Franconi (1998) to define a Bayesian hierarchical model for risk estimation. We follow a superpopulation approach similar to Bethlehem et al. (1990) and Rinott (2003). For each combination of values of the key variables we derive the posterior distribution of the population frequency given the observed sample frequency. Knowledge of this posterior distribution enables us to obtain suitable summaries that can be used to estimate the risk of disclosure. One such summary is the mean of the reciprocal of the population frequency, known as the Benedetti-Franconi risk, but we also investigate others such as the mode. We apply our approach to an artificial sample of the Italian 1991 Census data, drawn by means of a widely used sampling scheme. We report the results of this application and document the computational difficulties that we encountered. The risk estimates that we obtain are sensible, but suggest possible improvements and modifications to our methodology. We discuss these together with potential alternative strategies.
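As a rough illustration of the "mean of the reciprocal of the population frequency" summary mentioned in the abstract, the sketch below computes E[1/F | f] numerically. It does not reproduce the hierarchical model of the paper: it assumes, purely for illustration, that the excess F - f of the population frequency over the sample frequency f follows a negative binomial distribution with f successes and a success probability `p`. The function name `reciprocal_mean_risk` and the parameters `p` and `tol` are hypothetical.

```python
import math


def reciprocal_mean_risk(f, p, tol=1e-10, max_terms=1_000_000):
    """Approximate E[1/F | f] under an ASSUMED posterior (not the paper's model):
    F = f + N, where N is negative binomial with f successes and success
    probability p, i.e. P(N = j) = C(j + f - 1, j) * p**f * (1 - p)**j.
    """
    if f < 1:
        raise ValueError("sample frequency f must be at least 1")
    if not (0.0 < p <= 1.0):
        raise ValueError("p must lie in (0, 1]")

    q = 1.0 - p
    prob = p ** f        # P(N = 0)
    cum = prob           # probability mass accumulated so far
    total = prob / f     # contribution of the term F = f
    j = 0
    # Stop once the remaining probability mass is below tol; the truncated
    # tail then contributes less than tol to the expectation.
    while 1.0 - cum > tol and j < max_terms:
        j += 1
        # Recurrence: P(N = j) = P(N = j - 1) * (j + f - 1) / j * (1 - p)
        prob *= (j + f - 1) / j * q
        cum += prob
        total += prob / (f + j)
    return total


if __name__ == "__main__":
    # For a sample unique (f = 1) the series sums to p / (1 - p) * ln(1 / p).
    p = 0.1
    print(reciprocal_mean_risk(1, p), p / (1 - p) * math.log(1 / p))
```

Under this assumed posterior, a combination of key variables observed once in the sample (f = 1) receives the largest risk, and the risk decreases as f grows. The plain recurrence is meant for moderate values of f and p; very small p combined with large f would call for a log-space computation.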

Full record

Year of publication: 2004

Volume title: PRIVACY IN STATISTICAL DATABASES

ISBN: 9783540221180; 9783540259558

Type: 02 Publication in a volume::02a Chapter or article

Citation: A Bayesian Hierarchical Model Approach to Risk Estimation in Statistical Disclosure Limitation / Polettini, Silvia; Stander, Julian. - Print. - 3050(2004), pp. 247-261. - LECTURE NOTES IN COMPUTER SCIENCE. [10.1007/978-3-540-25955-8_19].

Belongs to type: 02a Chapter or article

Files attached to this product

There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/467225

Warning: the data displayed have not been validated by the university.
