Bayesian semiparametric disclosure risk estimation via mixed effects log-linear models / Polettini, Silvia; Carota, Cinzia; Filippone, Maurizio; Leombruni, Roberto. - (2014). (Paper presented at the conference Frontiers of Hierarchical Modeling in Observational Studies, Complex Surveys and Big Data: A Conference Honoring Professor Malay Ghosh, held at the Joint Program in Survey Methodology (JPSM), University of Maryland, College Park, Maryland, USA, May 28–31, 2014).

Bayesian semiparametric disclosure risk estimation via mixed effects log-linear models

Polettini, Silvia
2014

Abstract

In releasing data arising from sample surveys, statistical agencies must meet their obligation to protect the confidentiality of respondents. Prior to any data release, disclosure risk assessment is necessary. Disclosure may occur because ill-intentioned users might exploit their own information to link records in the released data to target individuals by matching on common characteristics (keys) that permit identification. Following the literature, we focus on categorical key variables, which are typical of socio-demographic surveys. Intuitively, individuals who are unique or rare in the population with respect to the key variables are at high risk of disclosure. Indeed, the number of observations that are unique in the sample and also unique, or rare, in the population is commonly used to measure the overall risk of disclosure in the sample data. Many authors have attempted to estimate risk by employing parametric models on cross classifications of the keys, i.e. multi-way contingen
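The risk measure mentioned in the abstract, the number of sample uniques that are also population uniques, can be illustrated with a minimal sketch. This is not the authors' model: it assumes the population is fully observed, which is generally not the case in practice and is precisely why model-based estimation is needed. The key variables (`sex`, `age`, `region`), the toy records, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical categorical key variables (not taken from the paper).
KEYS = ("sex", "age", "region")


def count_cells(records, keys):
    """Count the records in each cell of the cross classification of the keys."""
    return Counter(tuple(r[k] for k in keys) for r in records)


def sample_uniques_risk(sample, population, keys):
    """Return (number of sample uniques, number of those that are also
    unique in the population). Assumes the population is fully known,
    which is an idealisation used here only for illustration."""
    sample_counts = count_cells(sample, keys)
    population_counts = count_cells(population, keys)
    sample_uniques = [cell for cell, f in sample_counts.items() if f == 1]
    also_population_unique = sum(
        1 for cell in sample_uniques if population_counts[cell] == 1
    )
    return len(sample_uniques), also_population_unique


# Toy population of six individuals (illustrative data only).
population = [
    {"sex": "F", "age": "30-39", "region": "North"},
    {"sex": "F", "age": "30-39", "region": "North"},
    {"sex": "M", "age": "60+", "region": "South"},
    {"sex": "M", "age": "30-39", "region": "North"},
    {"sex": "M", "age": "30-39", "region": "North"},
    {"sex": "F", "age": "60+", "region": "South"},
]

# Toy sample: four of the six individuals.
sample = [population[0], population[2], population[3], population[4]]

n_sample_uniques, n_also_population_unique = sample_uniques_risk(
    sample, population, KEYS
)
print(n_sample_uniques, n_also_population_unique)
```

In this toy example two sample records are unique in the sample, but only one of them is also unique in the population; that record is the one an intruder matching on the keys could identify with certainty.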
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/647618