This paper presents DIS-PIPE, a software tool that leverages well-established process mining techniques to tackle the Data Pipeline Discovery (DPD) task. A data pipeline is a composition of steps that moves data from disparate sources to data consumers. As data travels through the pipeline, it can undergo various transformations performed by computational platforms. In this context, DPD aims to learn the structure and behavior of a data pipeline from an event log that records its past executions, uncovering, to some extent, execution-related dark data whose knowledge is critical to improving the quality of pipeline modeling. DIS-PIPE has been designed, implemented, and validated in the context of the H2020 European project DataCloud, and can interpret XES logs enriched with information that captures the core concepts of data pipelines.
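To make the idea of learning pipeline structure from an event log more concrete, the following minimal sketch counts the directly-follows relation between steps in a small XES log, a basic building block of many process discovery techniques. It is not DIS-PIPE's implementation: the sample log, the step names (`ingest`, `clean`, `store`), and the helper functions are hypothetical, and the sketch assumes a namespace-free XES document whose events are already listed in execution order and carry only the standard `concept:name` attribute.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical two-trace XES log; each trace is one past pipeline execution.
# Step names are illustrative only and do not come from the paper.
SAMPLE_XES = """<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="run-1"/>
    <event><string key="concept:name" value="ingest"/></event>
    <event><string key="concept:name" value="clean"/></event>
    <event><string key="concept:name" value="store"/></event>
  </trace>
  <trace>
    <string key="concept:name" value="run-2"/>
    <event><string key="concept:name" value="ingest"/></event>
    <event><string key="concept:name" value="store"/></event>
  </trace>
</log>"""


def step_names(trace):
    """Return the concept:name of each event in a trace, in document order."""
    names = []
    for event in trace.findall("event"):
        for attr in event.findall("string"):
            if attr.get("key") == "concept:name":
                names.append(attr.get("value"))
    return names


def directly_follows(xes_text):
    """Count how often one step is immediately followed by another."""
    root = ET.fromstring(xes_text)
    counts = Counter()
    for trace in root.findall("trace"):
        steps = step_names(trace)
        # Pair each step with its successor within the same trace.
        counts.update(zip(steps, steps[1:]))
    return counts


dfg = directly_follows(SAMPLE_XES)
for (src, dst), n in sorted(dfg.items()):
    print(f"{src} -> {dst}: {n}")
```

The resulting counts can be read as a weighted graph over pipeline steps. Real XES files usually declare an XML namespace and carry timestamps and further attributes, which this sketch deliberately ignores.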

DIS-PIPE: A Tool for Data Pipeline Discovery / Agostinelli, S.; Benvenuti, D.; Marrella, A.; Rossi, J. - 3783:(2024). (Paper presented at the International Conference on Process Mining, held in Copenhagen, Denmark).

DIS-PIPE: A Tool for Data Pipeline Discovery

Agostinelli S.;Benvenuti D.;Marrella A.;Rossi J.
2024

International Conference on Process Mining
Dark Data; Data Pipeline; Data Pipeline Discovery (DPD); DataCloud; Event Log; Process Mining; XES
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this record

File: Agostinelli_DIS-PIPE_2024.pdf
Access: open access
Note: https://ceur-ws.org/Vol-3783/paper_354.pdf
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 318.05 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1724598