A major concern in releasing microdata sets is protecting the privacy of individuals in the sample. Consider a data set in the form of a high-dimensional contingency table. If an individual belongs to a cell with small frequency, an intruder with certain knowledge about that individual may identify them and learn sensitive information about them from the data. To estimate the risk of such a breach of confidentiality, we introduce several nonparametric models that progressively extend the one adopted by Skinner and Holmes (1998). The latter is a Poisson model whose rates follow a mixed effects log-linear model with normal random effects. In the first extension, we assume Dirichlet process random effects and, mimicking Skinner and Holmes (1998), we keep the fixed effects constant. Next, we relax the latter assumption and consider a model in which all effects are unknown. In both extended models the total mass parameter of the Dirichlet process is also unknown. The MCMC methods used for inference are discussed extensively. An application to real data concludes the article.
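The abstract does not give the model equations, so the following is only a minimal sketch under stated assumptions. It assumes the usual Poisson set-up associated with Skinner and Holmes (1998): population cell counts F_k ~ Poisson(lambda_k), a sample drawn with a known sampling fraction pi, and log(lambda_k) = x_k' beta + e_k. Under those assumptions, the risk for a cell that is unique in the sample is Pr(F_k = 1 | f_k = 1) = exp(-(1 - pi) * lambda_k). The function names, the toy design matrix, and the parameter values are hypothetical and are not taken from the paper; in practice beta and the random effects e_k would be drawn from a posterior (for example via MCMC) rather than fixed by hand.

```python
import math


def rates_from_loglinear(design, beta, random_effects):
    """Cell rates lambda_k = exp(x_k' beta + e_k) for a log-linear model.

    design: list of covariate rows x_k (hypothetical toy input)
    beta: fixed-effect coefficients
    random_effects: one cell-level effect e_k per row
    """
    if len(design) != len(random_effects):
        raise ValueError("need one random effect per cell")
    rates = []
    for row, effect in zip(design, random_effects):
        if len(row) != len(beta):
            raise ValueError("each design row must match the length of beta")
        linear_predictor = sum(x * b for x, b in zip(row, beta)) + effect
        rates.append(math.exp(linear_predictor))
    return rates


def sample_unique_risk(lam, pi):
    """Pr(F_k = 1 | f_k = 1) = exp(-(1 - pi) * lam).

    Assumes F_k ~ Poisson(lam) and independent sampling with fraction pi,
    so the unsampled part F_k - f_k is Poisson((1 - pi) * lam).
    """
    if lam < 0:
        raise ValueError("rate must be non-negative")
    if not 0 < pi <= 1:
        raise ValueError("sampling fraction must lie in (0, 1]")
    return math.exp(-(1.0 - pi) * lam)


# Toy illustration with made-up values: two cells, an intercept and one covariate.
toy_design = [[1.0, 0.0], [1.0, 1.0]]
toy_beta = [0.0, 1.0]
toy_effects = [0.0, 0.0]
toy_rates = rates_from_loglinear(toy_design, toy_beta, toy_effects)
toy_risks = [sample_unique_risk(lam, pi=0.5) for lam in toy_rates]
```

In this sketch a cell with a larger rate gets a lower risk, since a sample unique in a well-populated cell is less likely to be unique in the population. A full sampling fraction (pi = 1) gives a risk of 1, because the sample then coincides with the population.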

Disclosure risk estimation via nonparametric log-linear models / Carota, Cinzia; Filippone, Maurizio; Leombruni, Roberto; Polettini, Silvia. - Electronic. - (2012). (Paper presented at the XLVI Scientific Meeting of the Italian Statistical Society, held in Rome, 20-22 June 2012).

Disclosure risk estimation via nonparametric log-linear models

POLETTINI, SILVIA
2012

ISBN: 9788861298828


Use this identifier to cite or link to this document: https://hdl.handle.net/11573/482340
