DNNs, commonly employed for complex tasks such as image and language processing, are increasingly deployed on Internet of Things (IoT) devices. These devices operate with constrained resources: limited computational power, memory, slower processors, and tight energy budgets. Optimizing DNN models is therefore crucial to minimize memory usage and computation time. However, traditional optimization methods require skilled professionals to manually fine-tune hyperparameters, striking a balance between efficiency and accuracy. This paper introduces a solution for identifying optimal hyperparameters, focusing on the application of pruning, weight clustering, and quantization. Initial empirical analyses were conducted to understand the relationships between model size, accuracy, pruning rate, and the number of clusters. Building on these findings, we developed a framework comprising two algorithms: one that discovers the optimal pruning rate and one that determines the optimal number of clusters. Combined with the best quantization configuration, our tool integrates an optimization procedure that reduces both model size and inference time. The optimized models achieve results comparable to, and in some cases surpassing, those of more complex state-of-the-art approaches. The framework successfully optimized ResNet50, reducing the model size by 6.35x with a speedup of 2.91x while sacrificing only 0.87% of the original accuracy.
Giovannesi, Luca; Proietti Mattia, Gabriele; Beraldi, Roberto. Targeted and Automatic Deep Neural Networks Optimization for Edge Computing. In: International Conference on Advanced Information Networking and Applications (AINA, formerly ICOIN), Kitakyushu, Japan, 2024, vol. 203, pp. 57-68. DOI: 10.1007/978-3-031-57931-8_6.
Targeted and Automatic Deep Neural Networks Optimization for Edge Computing

Giovannesi, Luca (first author): Software
Proietti Mattia, Gabriele (second author): Methodology
Beraldi, Roberto (last author): Supervision

2024
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
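As a purely illustrative note, the two compression steps the abstract names, magnitude pruning and weight clustering, can be sketched in a few lines of NumPy. This is not the paper's framework or its search algorithms; `prune_rate` and `n_clusters` here simply stand in for the two hyperparameters the proposed algorithms search over, and the 1-D k-means used for clustering is an assumption, not the authors' method.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, prune_rate: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of weights (illustrative only)."""
    flat = np.abs(weights).ravel()
    k = int(prune_rate * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def cluster_weights(weights: np.ndarray, n_clusters: int, iters: int = 20) -> np.ndarray:
    """Replace each weight with the nearest of n_clusters shared values (1-D k-means)."""
    flat = weights.ravel()
    # Initialize centroids evenly across the weight range.
    centroids = np.linspace(flat.min(), flat.max(), n_clusters)
    for _ in range(iters):
        assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        for c in range(n_clusters):
            members = flat[assign == c]
            if members.size:  # keep centroid unchanged if its cluster is empty
                centroids[c] = members.mean()
    assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids[assign].reshape(weights.shape)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
w_pruned = magnitude_prune(w, prune_rate=0.5)        # at least 50% of weights become zero
w_clustered = cluster_weights(w_pruned, n_clusters=16)  # at most 16 distinct weight values
print(f"sparsity: {np.mean(w_pruned == 0):.2f}, unique values: {np.unique(w_clustered).size}")
```

After these two steps, the model stores only a small codebook of shared values plus per-weight indices, which is what makes the subsequent quantization and size reduction effective.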