Service-Oriented AI Model Compression for Computing Continuum Environments / Puglisi, A.; Monti, F.; Napoli, C.; Mecella, M. - 16320:(2026), pp. 303-318. (23rd International Conference on Service-Oriented Computing, ICSOC 2025, Shenzhen, China) [10.1007/978-981-95-5012-8_23].

Service-Oriented AI Model Compression for Computing Continuum Environments

Puglisi A.; Monti F.; Napoli C.; Mecella M.
2026

Abstract

Modern neural networks often rely on over-parameterized architectures to ensure stability and accuracy, but in many real-world scenarios, such as the Internet of Things and edge devices, large models are difficult to deploy because of computational and memory limitations. Although compression techniques exist, they are rarely integrated into a service-oriented architecture that can dynamically adapt AI models to heterogeneous devices. In this work, we propose a cloud continuum framework for AI model optimization as a service, in which edge devices can send neural networks to the cloud, where they are automatically compressed and returned as lightweight versions ready for local execution. At the core of this process is ImproveNet, a method that structurally reduces the size of a neural network during training without compromising its ability to solve the original task. Starting from a standard-sized network, the system monitors performance during training and, once accuracy requirements are met, applies channel reduction and internal layer elimination, progressively simplifying the architecture. The resulting model is returned to the device, enabling AI-on-the-continuum deployment and execution.
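
The control flow outlined in the abstract (check the accuracy requirement, then progressively apply channel reduction and internal layer elimination) can be illustrated with a minimal sketch. This is not the authors' ImproveNet implementation: the function names, the 0.75 shrink factor, the choice of which layer to drop, and the toy accuracy function are all hypothetical and stand in for real training and evaluation.

```python
from typing import Callable, List


def shrink_channels(widths: List[int], factor: float = 0.75, min_width: int = 4) -> List[int]:
    """Hypothetical channel reduction: scale every layer width by `factor`."""
    return [max(min_width, int(w * factor)) for w in widths]


def drop_inner_layer(widths: List[int]) -> List[int]:
    """Hypothetical layer elimination: remove the middle layer, keep at least one."""
    if len(widths) <= 1:
        return list(widths)
    mid = len(widths) // 2
    return widths[:mid] + widths[mid + 1:]


def compress_while_training(
    widths: List[int],
    train_and_eval: Callable[[List[int]], float],
    target_accuracy: float,
    max_rounds: int = 10,
) -> List[int]:
    """Greedy loop: keep a smaller architecture only while the target is still met."""
    best = list(widths)
    # Compression starts only once the accuracy requirement is met.
    if train_and_eval(best) < target_accuracy:
        return best
    for round_idx in range(max_rounds):
        # Alternate between the two structural reductions named in the abstract.
        if round_idx % 2 == 0:
            candidate = shrink_channels(best)
        else:
            candidate = drop_inner_layer(best)
        if candidate == best:
            continue  # this operation cannot reduce the model any further
        if train_and_eval(candidate) >= target_accuracy:
            best = candidate  # requirement still met: accept the smaller model
        else:
            break  # reduction hurt accuracy: return the last accepted model
    return best


def toy_eval(widths: List[int]) -> float:
    """Assumed stand-in for training: accuracy grows with total width, capped at 0.99."""
    return min(0.99, 0.60 + 0.002 * sum(widths))


result = compress_while_training([64, 64, 64, 64], toy_eval, target_accuracy=0.85)
print(result)
```

In a service setting, `train_and_eval` would be replaced by an actual training and validation step run in the cloud, and the returned layer widths would describe the lightweight model sent back to the device.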
23rd International Conference on Service-Oriented Computing, ICSOC 2025
AI-as-a-Service; Computing Continuum; Neural Network Compression
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
Puglisi_Service-Oriented-AI_2026.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 14.92 MB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1765414