Catalogo dei prodotti della ricerca

Recent joint embedding-based self-supervised methods have surpassed standard supervised approaches on various image recognition tasks such as image classification. These self-supervised methods aim at maximizing agreement between features extracted from two differently transformed views of the same image, which results in learning an invariant representation with respect to appearance and geometric image transformations. However, the effectiveness of these approaches remains unclear in the context of gaze estimation, a structured regression task that requires equivariance under geometric transformations (e.g., rotations, horizontal flip). In this work, we propose SwAT, an equivariant version of the online clustering-based self-supervised approach SwAV, to learn more informative representations for gaze estimation. We demonstrate that SwAT, with ResNet-50 and supported with uncurated unlabeled face images, outperforms state-of-the-art gaze estimation methods and supervised baselines in various experiments. In particular, we achieve up to 57% and 25% improvements in cross-dataset and within-dataset evaluation tasks on existing benchmarks (ETH-XGaze, Gaze360, and MPIIFaceGaze).

Towards Self-Supervised Gaze Estimation / Farkhondeh, A.; Palmero, C.; Scardapane, S.; Escalera, S.. - (2022), pp. 1-13. (Intervento presentato al convegno 33rd British Machine Vision Conference Proceedings, BMVC 2022 tenutosi a London; UK).

Towards Self-Supervised Gaze Estimation

Farkhondeh A.;Palmero C.;Scardapane S.;Escalera S.

2022

Abstract

Recent joint embedding-based self-supervised methods have surpassed standard supervised approaches on various image recognition tasks such as image classification. These self-supervised methods aim at maximizing agreement between features extracted from two differently transformed views of the same image, which results in learning an invariant representation with respect to appearance and geometric image transformations. However, the effectiveness of these approaches remains unclear in the context of gaze estimation, a structured regression task that requires equivariance under geometric transformations (e.g., rotations, horizontal flip). In this work, we propose SwAT, an equivariant version of the online clustering-based self-supervised approach SwAV, to learn more informative representations for gaze estimation. We demonstrate that SwAT, with ResNet-50 and supported with uncurated unlabeled face images, outperforms state-of-the-art gaze estimation methods and supervised baselines in various experiments. In particular, we achieve up to 57% and 25% improvements in cross-dataset and within-dataset evaluation tasks on existing benchmarks (ETH-XGaze, Gaze360, and MPIIFaceGaze).

Scheda breve

Scheda completa

	Anno di pubblicazione
	
				2022
			
	Nome convegno
	
				33rd British Machine Vision Conference Proceedings, BMVC 2022
			
	Parole chiave
	
				self-supervised; gaze estimation; image recognition
			
	Tipologia
	
				04 Pubblicazione in atti di convegno::04b Atto di convegno in volume
			
	Citazione
	
				Towards Self-Supervised Gaze Estimation / Farkhondeh, A.; Palmero, C.; Scardapane, S.; Escalera, S.. - (2022), pp. 1-13. (Intervento presentato al  convegno 33rd British Machine Vision Conference Proceedings, BMVC 2022 tenutosi a London; UK).
			
	Appartiene alla tipologia:
	
				04b Atto di convegno in volume

File allegati a questo prodotto

File	Dimensione	Formato
Farkhondeh_Towards_2022.pdf accesso aperto Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore) Licenza: Tutti i diritti riservati (All rights reserved) Dimensione 1.15 MB Formato Adobe PDF	1.15 MB	Adobe PDF

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1702060

Citazioni

ND

2

ND

social impact