Evaluating creative work with artificial intelligence. Evidence from constrained innovation tasks / Addis, Valerio Fedele; Attanasi, Giuseppe; Di Bartolomeo, Giovanni; Mariella, Michele; Peruzzi, Valentina. - In: TECHNOVATION. - ISSN 0166-4972. - 155:(2026), pp. -1. [10.1016/j.technovation.2026.103571]

Evaluating creative work with artificial intelligence. Evidence from constrained innovation tasks

Valerio Fedele Addis; Giuseppe Attanasi; Giovanni Di Bartolomeo; Michele Mariella; Valentina Peruzzi

2026

Abstract

We study whether a large language model can reliably evaluate human creativity in constrained, innovation-like tasks. Using expert-generated creative outputs from a validated experiment with workers in cultural and creative industries, we embed ChatGPT as an evaluator and benchmark its assessments against expert human judgments obtained through the Consensual Assessment Technique. Study 1 supports AI reliability by showing that AI-based creativity evaluations exhibit internal consistency comparable to that of expert judges across repeated and independent runs, even under conservative scenarios. Replacing a human judge with an AI evaluator does not reduce inter-rater reliability across drawing, mathematical, and verbal tasks. Beyond reliability, AI evaluations display three additional features that are difficult to achieve with human-only panels: lower evaluative variability, systematically higher scores consistent with a potentially more inclusive evaluative stance, and task-independence of evaluative standards. Study 2 further supports task-independence by showing that AI evaluations are structured along fluency, flexibility, originality, and elaboration, with dimension weights that adapt to task-specific constraints.

Year of publication: 2026
Keywords: artificial intelligence; creativity evaluation; constrained creativity tasks; consensual assessment technique; cultural and creative industry professionals; innovation-like tasks
Type: 01 Journal publication::01a Journal article

Files attached to this product

File	Access	Type	License	Size	Format
Addis_Evaluating_2026.pdf	Open access	Publisher's version (published version with the publisher's layout)	Creative Commons	1.43 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1767943
