
Targeted and Automatic Deep Neural Networks Optimization for Edge Computing / Giovannesi, Luca; Proietti Mattia, Gabriele; Beraldi, Roberto. - Vol. 203 (2024), pp. 57-68. (International Conference on Advanced Information Networking and Applications (AINA), Kitakyushu, Japan) [DOI: 10.1007/978-3-031-57931-8_6].

Targeted and Automatic Deep Neural Networks Optimization for Edge Computing

Giovannesi, Luca (first author; Software);
Proietti Mattia, Gabriele (second author; Methodology);
Beraldi, Roberto (last author; Supervision)
2024

Abstract

Deep neural networks (DNNs), commonly employed for complex tasks such as image and language processing, are increasingly deployed on Internet of Things (IoT) devices. These devices operate under tight resource constraints, including limited computational power, memory, slower processors, and restricted energy budgets. Consequently, optimizing DNN models is crucial to minimize memory usage and computational time. However, traditional optimization methods require skilled professionals to manually fine-tune hyperparameters, striking a balance between efficiency and accuracy. This paper introduces a solution for identifying optimal hyperparameters, focusing on the application of pruning, weight clustering, and quantization. Initial empirical analyses were conducted to understand the relationships between model size, accuracy, pruning rate, and the number of clusters. Building on these findings, we developed a framework comprising two algorithms: one that discovers the optimal pruning rate and one that determines the optimal number of clusters. Combining these algorithms with the best quantization configuration, our tool integrates an optimization procedure that reduces both model size and inference time. The optimized models achieve results comparable to, and in some cases surpassing, those of more complex state-of-the-art approaches. The framework optimized ResNet50, reducing the model size by 6.35x with a speedup of 2.91x while sacrificing only 0.87% of the original accuracy.
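The abstract names three compression steps: pruning, weight clustering, and quantization. As a rough illustration only (this is not the paper's framework, whose algorithms search for the optimal pruning rate and cluster count automatically), a minimal pure-Python sketch of the three operations on a flat list of weights might look like:

```python
def prune(weights, rate):
    """Magnitude pruning: zero out roughly the `rate` fraction of
    weights with the smallest absolute value (ties may prune fewer)."""
    k = int(len(weights) * rate)
    cutoff = sorted(abs(w) for w in weights)[k] if k < len(weights) else float("inf")
    return [0.0 if abs(w) < cutoff else w for w in weights]

def cluster(weights, n_clusters, iters=20):
    """Weight clustering via 1-D k-means: replace each nonzero weight
    with its nearest cluster centroid, so few distinct values remain."""
    nz = [w for w in weights if w != 0.0]
    if not nz:
        return weights
    lo, hi = min(nz), max(nz)
    # Initialize centroids evenly over the nonzero weight range.
    centroids = [lo + (hi - lo) * i / max(n_clusters - 1, 1) for i in range(n_clusters)]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for w in nz:
            groups[min(range(len(centroids)), key=lambda i: abs(w - centroids[i]))].append(w)
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return [0.0 if w == 0.0 else min(centroids, key=lambda c: abs(w - c))
            for w in weights]

def quantize_int8(weights):
    """Symmetric linear quantization: map floats to int8 codes in
    [-127, 127] plus a single float scale factor."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    return [round(w / scale) for w in weights], scale

# Example: prune a third of the weights, cluster the rest to 2 values,
# then quantize the result.
w = [0.05, -1.0, 0.9, 0.02, -0.88, 0.5]
p = prune(w, 1/3)                # the two smallest-magnitude weights become zero
c = cluster(p, n_clusters=2)     # nonzero weights collapse to 2 shared values
codes, scale = quantize_int8(c)  # int8 codes plus a dequantization scale
```

In practice, each step trades accuracy for size, which is why the paper's framework searches for the pruning rate and cluster count rather than fixing them by hand.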
2024
International Conference on Advanced Information Networking and Applications (AINA)
Deep Neural Networks; DNN Acceleration; DNN Compression; Edge Computing
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this record

File: Giovannesi-preprint_Targeted_2024.pdf
Access: open access
Type: Pre-print (manuscript submitted to the publisher, prior to peer review)
License: Creative Commons
Size: 429.08 kB
Format: Adobe PDF

File: Givannesi_Targeted-and-Automatic_2024.pdf
Access: restricted to archive administrators
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 2.98 MB
Format: Adobe PDF (contact the author)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1709793
Citations
  • Scopus: 2
  • Web of Science: 2