Enhancing next activity prediction in process mining with Retrieval-Augmented Generation

Casciani, Angelo; Bernardi, Mario Luca; Cimitile, Marta; Marrella, Andrea

doi:10.1016/j.is.2025.102642

Next activity prediction is one of the main tasks of Predictive Process Monitoring (PPM), enabling organizations to forecast the execution of business processes and respond accordingly. Deep learning models predict effectively, but at the price of intensive training and feature engineering, which makes them less generalizable across domains. Large Language Models (LLMs) have recently been suggested as an alternative, but their capabilities in Process Mining tasks have yet to be extensively investigated. This work introduces a framework that combines LLMs with Retrieval-Augmented Generation to improve next activity prediction. By leveraging sequential information and data attributes from past execution traces, our framework enables LLMs to make more accurate predictions without additional training. We evaluate the approach on a wide range of event logs and compare it with state-of-the-art techniques. Findings show that our framework achieves competitive performance while being more adaptable across domains. Moreover, we assess early prediction capabilities, validate the significance of observed differences through statistical testing, and explore the impact of fine-tuning. Despite these advantages, we also report the framework’s limitations, mainly related to sensitivity to interleaving activities and to concept drift. Our findings highlight the potential of retrieval-augmented LLMs in PPM while identifying the need for future research into handling evolving process behaviors and the development of standard benchmarks.

Enhancing next activity prediction in process mining with Retrieval-Augmented Generation / Casciani, Angelo; Bernardi, Mario Luca; Cimitile, Marta; Marrella, Andrea. - In: INFORMATION SYSTEMS. - ISSN 0306-4379. - 137:(2026). [10.1016/j.is.2025.102642]



Publication year: 2026

Keywords: Large Language Model; Next activity prediction; Predictive Process Monitoring; Retrieval-Augmented Generation

Type: 01 Journal publication::01a Journal article


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1764649

Warning: the data displayed have not been validated by the university.
