
When Annotators Disagree: A Principled Approach to Learning with Noisy Labels / Bucarelli, Maria Sofia; Purificato, Antonio; Bacciu, Andrea; Cassano, Lucas; Siciliano, Federico; Nelakanti, Anil; Mantrach, Amin; Silvestri, Fabrizio. - In: IEEE TRANSACTIONS ON ARTIFICIAL INTELLIGENCE. - ISSN 2691-4581. - (2026), pp. 1-16. [10.1109/tai.2026.3666527]

When Annotators Disagree: A Principled Approach to Learning with Noisy Labels

Bucarelli, Maria Sofia (First author; Methodology); Purificato, Antonio (Second author; Methodology); Bacciu, Andrea (Software); Siciliano, Federico (Resources); Mantrach, Amin (Penultimate author; Writing – Review & Editing); Silvestri, Fabrizio (Last author; Project Administration)
2026

Abstract

In practical settings, classification datasets are often labeled by humans, leading to potential noise due to varying annotations from different individuals. The exact noise distribution impacting these labels is typically unknown; however, one quantity we can measure and attempt to exploit is inter-rater agreement. Building on this, our work makes key contributions: we (i) demonstrate how inter-annotator statistics can be used to estimate the label noise distribution; (ii) propose methods that leverage these estimates to train models on noisy data; and (iii) derive generalization bounds within the empirical risk minimization framework that depend on the estimated noise characteristics. Finally, we present experiments that support our findings.
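The mechanism behind contribution (i) can be illustrated in its simplest setting: binary classification where each annotator independently flips the true label with the same rate ρ. Two such annotators agree with probability a = ρ² + (1 − ρ)², which can be inverted to recover ρ from the measured agreement; the estimate then plugs into an unbiased loss correction in the style of Natarajan et al. (2013). This is a minimal sketch under that symmetric-noise assumption, with hypothetical function names — it is not the paper's general estimator.

```python
import math

def noise_rate_from_agreement(agreement: float) -> float:
    """Estimate the symmetric flip rate rho from inter-annotator agreement.

    Under symmetric noise, two independent annotators agree with
    probability a = rho**2 + (1 - rho)**2. Solving for the root with
    rho < 0.5 gives rho = (1 - sqrt(2a - 1)) / 2, valid for a >= 0.5.
    """
    if agreement < 0.5:
        raise ValueError("agreement < 0.5 is inconsistent with rho < 0.5")
    return (1.0 - math.sqrt(2.0 * agreement - 1.0)) / 2.0

def corrected_loss(loss_pos: float, loss_neg: float,
                   observed_label: int, rho: float) -> float:
    """Unbiased loss correction for symmetric label noise.

    Given the losses the model would incur against label +1 (loss_pos)
    and label -1 (loss_neg), reweight them so that the expectation over
    the noise equals the clean-label loss.
    """
    if observed_label == 1:
        return ((1 - rho) * loss_pos - rho * loss_neg) / (1 - 2 * rho)
    return ((1 - rho) * loss_neg - rho * loss_pos) / (1 - 2 * rho)
```

For example, an observed agreement of 0.82 yields ρ = 0.1, and with ρ = 0 the corrected loss reduces to the ordinary loss on the observed label.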
Crowdsourcing; Noisy Labels; Supervised Learning
01 Journal publication::01a Journal article
Files attached to this product
There are no files associated with this product.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1761896
Warning! The displayed data have not been validated by the university.

Citations
  • PMC: N/A
  • Scopus: 1
  • Web of Science (ISI): N/A