Basal cell carcinoma (BCC) is the most common skin cancer. Off-the-shelf multimodal large language models are widely accessible, yet their performance for BCC remains unclear. The aim of this study was to assess BCC detection (BCC vs non-BCC) and BCC subtype classification from clinical and dermoscopic images using 3 web-based large language models (ChatGPT-5, Gemini 2.5 Flash, Claude Sonnet 4). We evaluated 772 images: 402 from 290 histopathology-confirmed BCCs (290 clinical, 112 dermoscopic) and 370 from an independent BCC-mimicker cohort (250 clinical, 120 dermoscopic). Standardized prompts were used. Primary outcome was BCC detection accuracy; secondary outcomes were subtype-classification accuracy and performance by lesion features. For clinical images, ChatGPT-5 achieved the highest detection accuracy (75%), followed by Claude (64.3%) and Gemini (50.7%). For dermoscopy, Claude performed best (69.8%), compared with ChatGPT-5 (55.2%) and Gemini (50.9%). Accuracy was lower in crusted and flat lesions and higher in exophytic lesions; pigmentation effects were model dependent. Subtype-classification accuracy was modest across models. Images were primarily from European centers with limited skin-type diversity; several subgroups were small. Current web-based large language models are not clinically suitable for BCC detection or subtyping. Dermatology-specific training, transparent reporting, and rigorous prospective validation are required before any clinical use.

ChatGPT, Gemini, and Claude in clinical and dermoscopic image analysis of basal cell carcinoma and its common mimickers: A comparative performance analysis / Boostani, M; Zouboulis, CC; Pellacani, G; Navarrete-Dechent, C; Boussingault, L; Kiss, T; Goldfarb, N; Cantisani, C; Nádudvari, N; Bánvölgyi, A; Wikonkál, NM; Suppa, M; Paragh, G; Kiss, N. - In: JID INNOVATIONS. - ISSN 2667-0267. - (2026).

ChatGPT, Gemini, and Claude in clinical and dermoscopic image analysis of basal cell carcinoma and its common mimickers: A comparative performance analysis

Pellacani G; Cantisani C
2026

Artificial intelligence, Basal cell carcinoma, ChatGPT, Gemini, Large language model
01 Journal publication::01a Journal article
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1767282