Leveraging Inter-Rater Agreement for Classification in the Presence of Noisy Labels

Bucarelli, Maria Sofia; Cassano, Lucas; Siciliano, Federico; Mantrach, Amin; Silvestri, Fabrizio

doi:10.1109/CVPR52729.2023.00335

In practical settings, classification datasets are obtained through a labelling process that is usually carried out by humans. Labels can be noisy because they are obtained by aggregating the individual labels assigned to the same sample by multiple, possibly disagreeing annotators. The inter-rater agreement on these datasets can be measured, while the underlying noise distribution to which the labels are subject is assumed to be unknown. In this work, we: (i) show how to leverage the inter-annotator statistics to estimate the noise distribution to which labels are subject; (ii) introduce methods that use the estimate of the noise distribution to learn from the noisy dataset; and (iii) establish generalization bounds in the empirical risk minimization framework that depend on the estimated quantities. We conclude the paper by providing experiments that illustrate our findings.
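To give a concrete sense of how agreement statistics can constrain label noise, here is a minimal Python sketch. It is not the estimator from the paper, which the abstract does not spell out. It assumes a deliberately simplified model: two annotators who err independently, each reporting the true class with probability 1 - eps and otherwise picking one of the other K - 1 classes uniformly at random. Under that assumption the probability that they agree is (1 - eps)^2 + eps^2 / (K - 1), which can be inverted for eps. The function names `pairwise_agreement` and `symmetric_noise_rate` are hypothetical.

```python
import math


def pairwise_agreement(labels_a, labels_b):
    """Fraction of samples on which two annotators assign the same label."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


def symmetric_noise_rate(agreement, num_classes):
    """Invert agreement = (1 - eps)^2 + eps^2 / (K - 1) for eps.

    Assumption (not from the paper): symmetric, independent annotator noise.
    The smaller root is returned, i.e. annotators are assumed to be better
    than random guessing (eps <= (K - 1) / K).
    """
    k = num_classes
    if k < 2:
        raise ValueError("need at least two classes")
    # Discriminant of (K / (K - 1)) * eps^2 - 2 * eps + (1 - agreement) = 0,
    # clamped at zero to absorb sampling noise near chance-level agreement.
    disc = max(1.0 - k / (k - 1) * (1.0 - agreement), 0.0)
    return (k - 1) / k * (1.0 - math.sqrt(disc))


if __name__ == "__main__":
    rater_1 = [0, 1, 2, 1]
    rater_2 = [0, 1, 1, 1]
    observed = pairwise_agreement(rater_1, rater_2)
    print(observed, symmetric_noise_rate(observed, num_classes=3))
```

With real annotation data the symmetric, class-independent assumption rarely holds exactly, so the value returned here is at best a rough starting point for the class-conditional noise distribution the abstract refers to.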

Leveraging Inter-Rater Agreement for Classification in the Presence of Noisy Labels / Bucarelli, Maria Sofia; Cassano, Lucas; Siciliano, Federico; Mantrach, Amin; Silvestri, Fabrizio. - (2023), pp. 3439-3448. (Paper presented at the IEEE Conference on Computer Vision and Pattern Recognition, held in Vancouver, Canada) [10.1109/CVPR52729.2023.00335].

	Publication year
			2023
	Conference name
			IEEE Conference on Computer Vision and Pattern Recognition
	Keywords
			Deep Learning; Noisy Labels; Supervised Learning
	Type
			04 Publication in conference proceedings::04b Conference paper in volume
		
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1685080
