Ontology-Based Data Preparation in Healthcare: The Case of the AMD-STITCH Project

Croce, Federico; Valentini, Riccardo; Maranghi, Marianna; Grani, Giorgio; Lenzerini, Maurizio; Rosati, Riccardo

doi:10.1007/s42979-024-02757-w

In the context of healthcare, an AI solution is generally developed for a specific analysis task, based on a relevant dataset, with little attention to the reusability and generalizability of its data preparation step. This paper focuses on a different scenario, which can be called context-oriented, where a set of clinical data sources relevant to a specific context (e.g., a particular disease) is available and can be used for a variety of data analytics tasks, often carried out by different research groups. The aim of this research is therefore to present a systematic method that exploits the Ontology-based Data Management paradigm to enhance data preparation in a context-oriented scenario. The methodology has been applied to a big data project on the treatment of diabetes and its complications. The peculiarity and challenge of this project lie in the fact that it deals with real-world data, extracted from Electronic Medical Records over a 13-year timeframe and thus not collected for research purposes. The paper focuses on two main steps of data preparation, namely data modeling and data cleaning, and shows how this approach provides effective techniques for setting up a unified and shared database, to be used as an asset in the subsequent data analytics phases.

Ontology-Based Data Preparation in Healthcare: The Case of the AMD-STITCH Project / Croce, Federico; Valentini, Riccardo; Maranghi, Marianna; Grani, Giorgio; Lenzerini, Maurizio; Rosati, Riccardo. - In: SN COMPUTER SCIENCE. - ISSN 2661-8907. - 5:4(2024). [10.1007/s42979-024-02757-w]

Author roles: Federico Croce (co-first author); Riccardo Valentini (co-first author); Marianna Maranghi; Giorgio Grani; Maurizio Lenzerini (penultimate author); Riccardo Rosati (last author)

2024

Publication year	2024
Keywords	Ontology; data preparation; real-world healthcare
Type	01 Journal publication::01a Journal article

Files attached to this item

File	Size	Format
Croce_Ontology-based_2024.pdf (access: archive managers only; type: publisher's version, published with the publisher's layout; license: all rights reserved)	901.53 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1719621
