The quality of metadata plays a crucial role in many data FAIRification processes. So much so, in fact, that all four main principles of data FAIRification prescribe the use of high-quality metadata. One of the main data management paradigms where metadata is a first-class citizen is Ontology-Based Data Management (OBDM). The goal of OBDM is to provide users with a reconciled view of a set of heterogeneous data sources by means of a semantic metadata layer comprising an ontology and a mapping. The former is a high-level, declarative representation of the domain of interest written in terms of a logical theory, and the latter is a formal description of the relation between the symbols in the ontology and the data at the sources. In this article, we introduce a novel data quality framework based on OBDM and specifically tailored for metadata analysis. The target of this framework is one of the most common forms of metadata currently in circulation, i.e., the integrity constraints defined by a database schema. Specifically, we focus on the data quality dimension known as Consistency, i.e., the property of data being free of contradictions and incoherence. In this context, our techniques provide a set of tools to compare the integrity constraints defined by a database schema against the knowledge encoded in an ontology and to check whether these constraints are strict enough for such knowledge (i.e., they protect it) and not too strict for it (i.e., they are faithful to it). The contribution of the article is the presentation of the framework and the study of the related computational problems. We present a detailed computational complexity analysis of such problems and show that they are decidable for classes of OBDM specifications and integrity constraints that are very popular in practice.

Ontology-Based Schema-Level Data Quality: The Case of Consistency / Cima, G.; Console, M.; Lenzerini, M.. - In: ACM JOURNAL OF DATA AND INFORMATION QUALITY. - ISSN 1936-1955. - 17:4(2025), pp. 1-25. [10.1145/3770750]

Ontology-Based Schema-Level Data Quality: The Case of Consistency

Cima G.;Console M.;Lenzerini M.
2025

computational complexity; data dependencies; description logics; global consistency; integrity constraints; local consistency; Ontology-based data management
01 Journal publication::01a Journal article
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1764437