A comparative analysis for automated information extraction from OSHA Lockout/Tagout accident narratives with Large Language Model / Sabetta, N.; Costantino, F.; Stabile, S. - In: PROCEDIA COMPUTER SCIENCE. - ISSN 1877-0509. - 253:(2025), pp. 1362-1372. (6th International Conference on Industry 4.0 and Smart Manufacturing, ISM 2024, Prague, Czech Republic) [10.1016/j.procs.2025.01.198].

A comparative analysis for automated information extraction from OSHA Lockout/Tagout accident narratives with Large Language Model

Sabetta N.; Costantino F.; Stabile S.
2025

Abstract

Analysing workplace accidents is crucial for improving occupational safety by understanding causes and preventing recurrence. However, the primary challenge in analysing accident narratives lies in the unstructured nature of the text data. This study examines the effectiveness of Large Language Models (LLMs), specifically GPT-4 Turbo, in extracting information from lockout/tagout (LOTO) accident narratives in the Occupational Safety and Health Administration (OSHA) database. It compares the extracted features, namely the degree of fatality, nature of injury, and employee's occupation, with those recorded by OSHA supervisors. Despite occasional misclassifications and hallucinations, GPT-4 Turbo shows significant potential in automating critical information extraction, reducing reliance on human interpretation. Moreover, the model achieved high accuracy rates for each feature. These findings suggest that LLMs can enhance occupational safety data analysis, though improvements in prompt design and verification are recommended for further accuracy.
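The comparison described in the abstract, LLM-extracted features checked against the values recorded by OSHA, can be illustrated with a minimal sketch of a per-feature accuracy calculation. This is not the authors' code: the field names, the `normalise` helper, and the sample records are hypothetical and exist only for illustration, and exact string matching after light normalisation is an assumed comparison rule.

```python
from collections import defaultdict

# Hypothetical field names for the three features named in the abstract.
FEATURES = ("degree_of_fatality", "nature_of_injury", "occupation")


def normalise(value):
    """Lower-case and collapse whitespace so trivially different spellings match."""
    return " ".join(str(value).lower().split())


def per_feature_accuracy(extracted, recorded, features=FEATURES):
    """Return {feature: share of records where extracted equals recorded}."""
    if len(extracted) != len(recorded):
        raise ValueError("extracted and recorded must have the same length")
    hits = defaultdict(int)
    for ext, rec in zip(extracted, recorded):
        for feature in features:
            if normalise(ext.get(feature, "")) == normalise(rec.get(feature, "")):
                hits[feature] += 1
    n = len(recorded)
    return {feature: (hits[feature] / n if n else 0.0) for feature in features}


# Made-up records for illustration only; not data from the study.
recorded = [
    {"degree_of_fatality": "Fatality", "nature_of_injury": "Amputation",
     "occupation": "Machine operator"},
    {"degree_of_fatality": "Hospitalized injury", "nature_of_injury": "Fracture",
     "occupation": "Electrician"},
]
extracted = [
    {"degree_of_fatality": "fatality", "nature_of_injury": "Amputation",
     "occupation": "machine  operator"},
    {"degree_of_fatality": "Hospitalized injury", "nature_of_injury": "Burn",
     "occupation": "Electrician"},
]

accuracy = per_feature_accuracy(extracted, recorded)
for feature, value in accuracy.items():
    print(f"{feature}: {value:.2f}")
```

Exact matching is the strictest choice; free-text features such as occupation would likely need a mapping to a controlled vocabulary before a comparison of this kind is meaningful.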
6th International Conference on Industry 4.0 and Smart Manufacturing, ISM 2024
Generative AI; Health Administration (OSHA); Industrial Safety; LLM; LOTO; Occupational Safety
04 Publication in conference proceedings::04c Conference paper in a journal
Files attached to this item
File: Sabetta_A comparative-analysis_2025.pdf
Access: open access
Note: DOI 10.1016/j.procs.2025.01.198
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 972.98 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1738571
Citations
  • PMC: N/A
  • Scopus: 3
  • ISI: N/A