Research Products Catalog

Language-Oriented Semantic Latent Representation for Image Transmission / Cicchetti, G., Grassucci, E., Park, J., Choi, J., Barbarossa, S., Comminiello, D. - (2024). (IEEE International Workshop on Machine Learning for Signal Processing, MLSP, London, UK) [10.1109/MLSP58920.2024.10734812].

Language-Oriented Semantic Latent Representation for Image Transmission

Giordano Cicchetti; Eleonora Grassucci; Jihong Park; Jinho Choi; Sergio Barbarossa; Danilo Comminiello

2024

Abstract

In the new paradigm of semantic communication (SC), the focus is on delivering the meanings behind bits by extracting semantic information from raw data. Recent advances in data-to-text models facilitate language-oriented SC, particularly for text-transformed image communication via image-to-text (I2T) encoding and text-to-image (T2I) decoding. However, although semantically aligned, the text is too coarse to precisely capture sophisticated visual features such as spatial locations, color, and texture, incurring a significant perceptual difference between intended and reconstructed images. To address this limitation, in this paper, we propose a novel language-oriented SC framework that communicates both text and a compressed image embedding and combines them using a latent diffusion model to reconstruct the intended image. Experimental results validate the potential of our approach, which transmits only 2.09% of the original image size while achieving higher perceptual similarities in noisy communication channels compared to a baseline SC method that communicates only through text. The code is available at https://github.com/ispamm/Img2Img-SC/.
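To give a feel for the numbers in the abstract, the following is a minimal sketch and not the authors' implementation (that is in the linked repository). It computes the byte budget that the 2.09% figure would imply for an assumed uncompressed 512x512 RGB image, and passes a stand-in embedding through an additive white Gaussian noise channel. The image size, the random embedding, the choice of an AWGN channel model, the 10 dB SNR, and the helper names `payload_budget_bytes` and `awgn` are all assumptions made for illustration.

```python
import math
import random


def payload_budget_bytes(image_bytes: int, fraction: float) -> int:
    """Bytes available to the transmitter for a given fraction of the image size."""
    return round(image_bytes * fraction)


def awgn(values, snr_db, rng):
    """Return values corrupted by white Gaussian noise at the requested SNR (dB)."""
    signal_power = sum(v * v for v in values) / len(values)
    noise_std = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [v + rng.gauss(0.0, noise_std) for v in values]


rng = random.Random(0)

# Assumed for illustration: an uncompressed 512x512 8-bit RGB image.
image_bytes = 512 * 512 * 3
budget = payload_budget_bytes(image_bytes, 0.0209)  # 2.09% figure from the abstract

# Hypothetical stand-in for the compressed image embedding (random values,
# not produced by any real encoder).
embedding = [rng.gauss(0.0, 1.0) for _ in range(4096)]
received = awgn(embedding, snr_db=10.0, rng=rng)

# Empirical SNR of the received embedding, to check the channel model.
noise_power = sum((r - e) ** 2 for r, e in zip(received, embedding)) / len(embedding)
signal_power = sum(e * e for e in embedding) / len(embedding)
measured_snr_db = 10 * math.log10(signal_power / noise_power)

print(f"payload budget: {budget} bytes")
print(f"measured SNR: {measured_snr_db:.1f} dB")
```

In a full system the receiver would feed the noisy embedding and the received text into a generative decoder; that step is omitted here because it depends on the specific model used.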

Full record

Publication year	2024
Conference name	IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Keywords	encodings; generative model; generative semantic communication; image communication; semantic coding; semantic communication; semantics information; text modeling; visual feature
Type	04 Publication in conference proceedings::04b Conference paper in volume
Citation	Language-Oriented Semantic Latent Representation for Image Transmission / Cicchetti, G., Grassucci, E., Park, J., Choi, J., Barbarossa, S., Comminiello, D. - (2024). (IEEE International Workshop on Machine Learning for Signal Processing, MLSP, London, UK) [10.1109/MLSP58920.2024.10734812].
Belongs to type	04b Conference paper in volume

Files attached to this product

File	Type	License	Access	Size	Format
Cicchetti_Language-oriented_2025.pdf	Publisher's version (published version with the publisher's layout)	All rights reserved	Archive managers only	19.27 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1741690
