Service-Oriented AI Model Compression for Computing Continuum Environments / Puglisi, A.; Monti, F.; Napoli, C.; Mecella, M. - 16320 (2026), pp. 303-318. (23rd International Conference on Service-Oriented Computing, ICSOC 2025, Shenzhen, China) [10.1007/978-981-95-5012-8_23].
Service-Oriented AI Model Compression for Computing Continuum Environments
Puglisi A.; Monti F.; Napoli C.; Mecella M.
2026
Abstract
Modern neural networks often rely on over-parameterized architectures to ensure stability and accuracy, but in many real-world scenarios, such as the Internet of Things and edge devices, large models are difficult to deploy because of computational and memory limitations. Although compression techniques exist, they are rarely integrated into a service-oriented architecture that dynamically adapts AI models to heterogeneous devices. In this work, we propose a cloud continuum framework for AI model optimization as a service: edge devices can send neural networks to the cloud, where they are automatically compressed and returned as lightweight versions ready for local execution. At the core of this process is ImproveNet, a method that structurally reduces the size of a neural network during training without compromising its ability to solve the original task. Starting from a standard-sized network, the system monitors performance during training and, once accuracy requirements are met, applies channel reduction and internal layer elimination, progressively simplifying the architecture. The resulting model is returned to the device, enabling AI-on-the-continuum deployment and execution.

| File | Type | License | Size | Format | Access |
|---|---|---|---|---|---|
| Puglisi_Service-Oriented-AI_2026.pdf | Publisher's version (published version with the publisher's layout) | All rights reserved | 14.92 MB | Adobe PDF | Archive managers only |
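The abstract describes a loop that trains, checks an accuracy requirement, and then shrinks the network. The following is a minimal sketch of that control flow only, not the authors' ImproveNet implementation. The `Architecture` class, the `shrink` rule (drop one layer, scale channels by 0.75), and the `toy_train_epoch` accuracy model are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Architecture:
    """Hypothetical stand-in for a network: channel count of each internal layer."""
    channels: Tuple[int, ...]


def shrink(arch: Architecture, factor: float = 0.75,
           min_channels: int = 4, min_layers: int = 1) -> Architecture:
    """One structural reduction step (assumed rule, not taken from the paper)."""
    layers = list(arch.channels)
    if len(layers) > min_layers:
        layers.pop()  # internal layer elimination
    # Channel reduction: narrow every remaining layer, with a lower bound.
    layers = [max(min_channels, int(c * factor)) for c in layers]
    return Architecture(tuple(layers))


def compress_during_training(
    arch: Architecture,
    train_epoch: Callable[[Architecture], float],
    target_accuracy: float,
    max_epochs: int,
) -> Optional[Architecture]:
    """Train, and shrink whenever the accuracy requirement is met.

    Returns the smallest architecture that reached target_accuracy,
    or None if the requirement was never met within max_epochs.
    """
    accepted = None
    for _ in range(max_epochs):
        accuracy = train_epoch(arch)
        if accuracy < target_accuracy:
            continue  # keep training the current architecture
        accepted = arch
        smaller = shrink(arch)
        if smaller == arch:
            break  # nothing left to remove
        arch = smaller
    return accepted


def toy_train_epoch(arch: Architecture) -> float:
    """Toy accuracy model (assumption): accuracy grows with total channel count."""
    return min(0.99, 0.5 + 0.001 * sum(arch.channels))
```

With the toy model, `compress_during_training(Architecture((64, 64, 64, 64)), toy_train_epoch, 0.6, 10)` accepts the original network, then the three-layer network with 48 channels, and stops shrinking once the next candidate falls below the target. In a real setting, `train_epoch` would run an actual training and validation pass.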
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


