A Lightweight Model for Accurate Multi-View Hand Pose Recognition

Kadyrzhanov, Artur; Esteban-Romero, Sergio; Gil-Martín, Manuel; Marini, Marco

doi:10.5220/0014235600004052

This paper introduces a lightweight architecture for multi-view hand pose recognition based on multimodal fusion of images and landmarks. The proposed model employs a compact Convolutional Neural Network (CNN) to extract visual features from dual-view grayscale images, while a Multi-Layer Perceptron (MLP) processes the corresponding Leap Motion Controller 2 hand landmarks. The two modalities are fused to create an efficient yet discriminative representation. Compared to the Vision Transformer (ViT)+MLP baseline, which achieves an F1 score of 79.33 ± 0.09 % with 8.95 × 10^7 parameters, our CNN+MLP model reaches a higher recognition accuracy of 85.36 ± 0.08 % while requiring only 2.13 × 10^5 parameters, a reduction in model size of more than two orders of magnitude. Moreover, a landmarks-only variant using only the MLP achieves 85.22 ± 0.08 % accuracy with just 6.46 × 10^4 parameters. These results, obtained on the Multi-view Leap2 Hand Pose Dataset under a Leave-One-Subject-Out Cross-Validation protocol, demonstrate that accurate multi-view hand pose recognition can be achieved with dramatically fewer parameters, enabling efficient deployment in resource-constrained environments.
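The size comparison in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the parameter counts reported above. The function names are illustrative, and the storage estimate assumes 4 bytes per parameter (float32 weights), which the abstract does not state.

```python
# Parameter counts as reported in the abstract.
PARAMS = {
    "ViT+MLP (baseline)": 8.95e7,
    "CNN+MLP (proposed)": 2.13e5,
    "MLP (landmarks only)": 6.46e4,
}


def reduction_factor(baseline: float, model: float) -> float:
    """Return how many times smaller `model` is than `baseline`, by parameter count."""
    return baseline / model


def float32_size_mb(n_params: float) -> float:
    """Approximate weight storage in MB, ASSUMING 4 bytes per parameter (float32)."""
    return n_params * 4 / 1e6


baseline = PARAMS["ViT+MLP (baseline)"]
for name, n_params in PARAMS.items():
    print(
        f"{name}: {n_params:.2e} params, "
        f"~{float32_size_mb(n_params):.2f} MB, "
        f"{reduction_factor(baseline, n_params):.0f}x smaller than baseline"
    )
```

Under these assumptions, the CNN+MLP model is about 420 times smaller than the ViT+MLP baseline, and the landmarks-only MLP about 1385 times smaller.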

A Lightweight Model for Accurate Multi-View Hand Pose Recognition / Kadyrzhanov, Artur; Esteban-Romero, Sergio; Gil-Martín, Manuel; Marini, Marco. - (2026), pp. 2342-2349. (18th International Conference on Agents and Artificial Intelligence, Marbella, Spain) [10.5220/0014235600004052].

Publication year: 2026

Conference name: 18th International Conference on Agents and Artificial Intelligence

Keywords: Multi-view Hand Pose Recognition; Lightweight Model; Leap Motion Controller 2; Multimodal data; Multimodal fusion; Deep Learning.

Type: 04 Publication in conference proceedings::04b Conference paper in volume
Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1768221
