In recent decades, the widespread use of information systems has led to an increasing amount of sensitive data managed, posing several privacy and security risks. Controlled Query Evaluation (CQE) is a framework for limiting access to sensitive information when querying a system, while still providing useful responses. Initially proposed for databases, CQE has recently been studied for Description Logics ontologies, the logical formalization underpinning most Semantic Web technologies. Central to CQE is the possibility of specifying a data protection policy in a declarative way, avoiding the need to actively alter data to ensure their security. The policy is enforced via optimal censors, which aim to minimally alter information while maintaining confidentiality. Notably, censors satisfying the so-called indistinguishability property also ensure that the end user is not even able to discern if the answers to a query have been altered or not. The thesis investigates various CQE semantics based on the so-called optimal GA censors, which enjoy the indistinguishability property. In order not to arbitrarly pick a censor, we adopt the usual strategy of skeptical reasoning, i.e. answering according to all the optimal censors. Unfortunately, this approach is proven to be intractable, so we present an alternative semantics based on the intersection of all optimal GA censors (called IGA censor). Under this semantics, answering union of conjunctive queries (UCQs) in the Description Logic DL-LiteR is first-order (FO) rewritable, and thus in AC0 in data complexity. Anyway, the intersection-based approach may cause losing too much information, which led us to explore smarter semantics for improving the throughput of query answers while preserving confidentiality. Towards this aim, we first show how to exploit a priority relation between ontology predicates for reducing the number of censors. Then, we consider the strategy of selecting censors dynamically, exploiting the order in which queries are posed for maximizing the cooperativity of the system. In both cases, we provide suitable FO rewriting techniques, proving that the nice computational properties of the IGA semantics are preserved. The prioritized scenario also served as main setting for carrying out our experiments, wherein we implemented the newly-presented CQE techniques adapting them to the ontology-based data access methodology. Moreover, the research extends to the related field of Consistent Query Answering (CQA), which studies how to handle inconsistencies when answering queries. Specifically, we consider knowledge bases in which a set of existential rules must always be satisfied by the underlying database. A central notion in CQA is the one of repair, that is a maximal subset of the database satisfying all the rules. As for query answering, similarly as done in CQE, we examine both skeptical reasoning and the intersection-based semantics, known as AR and IAR, respectively. We study a very expressive language of rules and identify many subclasses where repair checking and UCQ entailment are tractable, or even FO rewritable. We finally investigate the integration between open and closed-world assumption, offering insights into utilizing open and closed predicates within the CQA framework, and provide the first set of complexity results for the aforenamed decision problems within this intriguing scenario.

Privacy-preserving and inconsistency-tolerant query answering in knowledge bases and databases / Marconi, Lorenzo. - (2024 May 07).

Privacy-preserving and inconsistency-tolerant query answering in knowledge bases and databases

MARCONI, LORENZO
07/05/2024

Abstract

In recent decades, the widespread use of information systems has led to an increasing amount of sensitive data managed, posing several privacy and security risks. Controlled Query Evaluation (CQE) is a framework for limiting access to sensitive information when querying a system, while still providing useful responses. Initially proposed for databases, CQE has recently been studied for Description Logics ontologies, the logical formalization underpinning most Semantic Web technologies. Central to CQE is the possibility of specifying a data protection policy in a declarative way, avoiding the need to actively alter data to ensure their security. The policy is enforced via optimal censors, which aim to minimally alter information while maintaining confidentiality. Notably, censors satisfying the so-called indistinguishability property also ensure that the end user is not even able to discern if the answers to a query have been altered or not. The thesis investigates various CQE semantics based on the so-called optimal GA censors, which enjoy the indistinguishability property. In order not to arbitrarly pick a censor, we adopt the usual strategy of skeptical reasoning, i.e. answering according to all the optimal censors. Unfortunately, this approach is proven to be intractable, so we present an alternative semantics based on the intersection of all optimal GA censors (called IGA censor). Under this semantics, answering union of conjunctive queries (UCQs) in the Description Logic DL-LiteR is first-order (FO) rewritable, and thus in AC0 in data complexity. Anyway, the intersection-based approach may cause losing too much information, which led us to explore smarter semantics for improving the throughput of query answers while preserving confidentiality. Towards this aim, we first show how to exploit a priority relation between ontology predicates for reducing the number of censors. Then, we consider the strategy of selecting censors dynamically, exploiting the order in which queries are posed for maximizing the cooperativity of the system. In both cases, we provide suitable FO rewriting techniques, proving that the nice computational properties of the IGA semantics are preserved. The prioritized scenario also served as main setting for carrying out our experiments, wherein we implemented the newly-presented CQE techniques adapting them to the ontology-based data access methodology. Moreover, the research extends to the related field of Consistent Query Answering (CQA), which studies how to handle inconsistencies when answering queries. Specifically, we consider knowledge bases in which a set of existential rules must always be satisfied by the underlying database. A central notion in CQA is the one of repair, that is a maximal subset of the database satisfying all the rules. As for query answering, similarly as done in CQE, we examine both skeptical reasoning and the intersection-based semantics, known as AR and IAR, respectively. We study a very expressive language of rules and identify many subclasses where repair checking and UCQ entailment are tractable, or even FO rewritable. We finally investigate the integration between open and closed-world assumption, offering insights into utilizing open and closed predicates within the CQA framework, and provide the first set of complexity results for the aforenamed decision problems within this intriguing scenario.
7-mag-2024
File allegati a questo prodotto
File Dimensione Formato  
Tesi_dottorato_Marconi.pdf

embargo fino al 07/05/2025

Tipologia: Tesi di dottorato
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 1.33 MB
Formato Adobe PDF
1.33 MB Adobe PDF   Contatta l'autore

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1709595
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact