Service-Oriented AI Model Compression for Computing Continuum Environments

Puglisi, A.; Monti, F.; Napoli, C.; Mecella, M.

doi:10.1007/978-981-95-5012-8_23

Modern neural networks often rely on over-parameterized architectures to ensure stability and accuracy, but in many real-world scenarios, such as the Internet of Things and edge devices, large models are difficult to deploy because of computational and memory limitations. Although compression techniques exist, they are rarely integrated into a service-oriented architecture that dynamically adapts AI models to heterogeneous devices. In this work, we propose a cloud continuum framework for AI model optimization as a service: edge devices can send neural networks to the cloud, where they are automatically compressed and returned in a lightweight version, ready for local execution. At the core of this process is ImproveNet, a method that structurally reduces the size of a neural network during training without compromising its ability to solve the original task. Starting from a standard-sized network, the system monitors performance during training and, once accuracy requirements are met, applies channel reduction and internal layer elimination, progressively simplifying the architecture. The resulting model is returned to the device, enabling AI-on-the-continuum deployment and execution.
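
The shrink-while-accurate loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' ImproveNet implementation: the names `Arch`, `candidates`, `shrink`, `train_and_eval` and `toy_eval`, the 0.75 keep ratio, and the greedy acceptance rule are all assumptions made for illustration, and a stand-in evaluator replaces real training.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Arch:
    """Hypothetical architecture summary: one channel count per internal layer."""
    channels: Tuple[int, ...]

    def size(self) -> int:
        # Crude size proxy (assumption): total number of channels.
        return sum(self.channels)


def candidates(arch: Arch, keep: float = 0.75) -> List[Arch]:
    """Return structurally smaller variants of `arch`."""
    out: List[Arch] = []
    # Internal layer elimination: drop one layer at a time, keep at least one.
    if len(arch.channels) > 1:
        for i in range(len(arch.channels)):
            out.append(Arch(arch.channels[:i] + arch.channels[i + 1:]))
    # Channel reduction: scale every layer's width, never below one channel.
    reduced = tuple(max(1, int(c * keep)) for c in arch.channels)
    if reduced != arch.channels:
        out.append(Arch(reduced))
    return out


def shrink(
    arch: Arch,
    train_and_eval: Callable[[Arch], float],
    target_acc: float,
    max_rounds: int = 50,
) -> Arch:
    """Greedily simplify `arch` while the accuracy requirement still holds."""
    # Only start compressing once the accuracy requirement is met.
    if train_and_eval(arch) < target_acc:
        return arch
    for _ in range(max_rounds):
        for cand in candidates(arch):
            if train_and_eval(cand) >= target_acc:
                arch = cand  # accept the first smaller variant that qualifies
                break
        else:
            break  # no smaller variant meets the requirement: stop
    return arch


def toy_eval(arch: Arch) -> float:
    # Stand-in for training plus validation (assumption): accuracy grows
    # with capacity and saturates at 0.99.
    return min(0.99, 0.5 + 0.005 * arch.size())


if __name__ == "__main__":
    print(shrink(Arch((64, 64, 64)), toy_eval, target_acc=0.9))
```

In a service-oriented deployment of the kind the abstract outlines, a cloud endpoint would receive the device's model, run a loop of this shape with a real training and evaluation routine in place of `toy_eval`, and return the reduced architecture to the device.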

Service-Oriented AI Model Compression for Computing Continuum Environments / Puglisi, A., Monti, F., Napoli, C., Mecella, M. - 16320:(2026), pp. 303-318. (23rd International Conference on Service-Oriented Computing, ICSOC 2025, Shenzhen, China) [10.1007/978-981-95-5012-8_23].

Publication year: 2026
Conference name: 23rd International Conference on Service-Oriented Computing, ICSOC 2025
Keywords: AI-as-a-service; Computing Continuum; Neural Network Compression
Type: 04 Publication in conference proceedings::04b Conference paper in volume
Belongs to type: 04b Conference paper in volume

Files attached to this record

File	Type	License	Access	Size	Format
Puglisi_Service-Oriented-AI_2026.pdf	Publisher's version (published version with the publisher's layout)	All rights reserved	Archive administrators only	14.92 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1765414
