Research Products Catalog

ChatGPT, Gemini, and Claude in clinical and dermoscopic image analysis of basal cell carcinoma and its common mimickers: A comparative performance analysis

Boostani M; Zouboulis CC; Pellacani G; Navarrete-Dechent C; Boussingault L; Kiss T; Goldfarb N; Cantisani C; Nádudvari N; Bánvölgyi A; Wikonkál NM; Suppa M; Paragh G; Kiss N.

2026

Abstract

Basal cell carcinoma (BCC) is the most common skin cancer. Off-the-shelf multimodal large language models are widely accessible, yet their performance for BCC remains unclear. The aim of this study was to assess BCC detection (BCC vs non-BCC) and BCC subtype classification from clinical and dermoscopic images using 3 web-based large language models (ChatGPT-5, Gemini 2.5 Flash, Claude Sonnet 4). We evaluated 772 images: 402 from 290 histopathology-confirmed BCCs (290 clinical, 112 dermoscopic) and 370 from an independent BCC-mimicker cohort (250 clinical, 120 dermoscopic). Standardized prompts were used. Primary outcome was BCC detection accuracy; secondary outcomes were subtype-classification accuracy and performance by lesion features. For clinical images, ChatGPT-5 achieved the highest detection accuracy (75%), followed by Claude (64.3%) and Gemini (50.7%). For dermoscopy, Claude performed best (69.8%), compared with ChatGPT-5 (55.2%) and Gemini (50.9%). Accuracy was lower in crusted and flat lesions and higher in exophytic lesions; pigmentation effects were model dependent. Subtype-classification accuracy was modest across models. Images were primarily from European centers with limited skin-type diversity; several subgroups were small. Current web-based large language models are not clinically suitable for BCC detection or subtyping. Dermatology-specific training, transparent reporting, and rigorous prospective validation are required before any clinical use.

Full record

	Publication year
		2026
	Keywords
		Artificial intelligence, Basal cell carcinoma, ChatGPT, Gemini, Large language model
	Type
		01 Journal publication::01a Journal article
	Citation
		ChatGPT, Gemini, and Claude in clinical and dermoscopic image analysis of basal cell carcinoma and its common mimickers: A comparative performance analysis / Boostani, M; Zouboulis, CC; Pellacani, G; Navarrete-Dechent, C; Boussingault, L; Kiss, T; Goldfarb, N; Cantisani, C; Nádudvari, N; Bánvölgyi, A; Wikonkál, NM; Suppa, M; Paragh, G; Kiss, N. - In: JID INNOVATIONS. - ISSN 2667-0267. - (2026).

Files attached to this product

There are no files associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1767282

Warning: the data displayed have not been validated by the university.
