
Image Watermarking Backdoor Attacks in CNN-Based Classification Tasks / Abbate, Giovanbattista; Amerini, Irene; Caldelli, Roberto. - 13646:(2023), pp. 4-16. (Paper presented at the ICPR 2022 International Workshops and Challenges, held in Montreal, QC, Canada) [10.1007/978-3-031-37745-7_1].

Image Watermarking Backdoor Attacks in CNN-Based Classification Tasks

Amerini, Irene;
2023

Abstract

In recent years, neural networks have become the basis for many kinds of applications, mainly due to the impressive performance they offer. Nevertheless, all that glitters is not gold: such tools have been shown to be highly vulnerable to malicious approaches such as gradient manipulation or the injection of adversarial samples. In particular, another kind of attack is to poison a neural network at training time by injecting a perceptually barely visible trigger signal into a small portion of the dataset (the target class), thereby creating a backdoor in the trained model. Such a backdoor can then be exploited to redirect all predictions to the chosen target class at test time. In this work, a novel backdoor attack that resorts to image watermarking algorithms to generate the trigger signal is presented. The watermark signal is almost imperceptible and is embedded in a portion of the images of the target class; two different watermarking algorithms have been tested. Experimental results carried out on the MNIST and GTSRB datasets show satisfactory performance in terms of attack success rate and introduced distortion.
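To make the poisoning procedure described in the abstract concrete, the sketch below embeds a low-amplitude additive pseudo-random pattern into a fraction of the target-class training images and reuses the same pattern at test time to activate the backdoor. This is a minimal illustration only: the function names, the poison_rate and alpha parameters, and the random pattern standing in for the two actual watermarking algorithms studied in the paper are all assumptions, not the authors' implementation.

import numpy as np

def make_trigger(shape, key=0, alpha=0.03):
    # Hypothetical trigger: a barely visible pseudo-random pattern used here in
    # place of a real image-watermarking signal (not the paper's algorithms).
    rng = np.random.default_rng(key)
    return alpha * rng.uniform(-1.0, 1.0, size=shape).astype(np.float32)

def poison_target_class(images, labels, target_class, poison_rate=0.1, key=0):
    # images: float32 in [0, 1], shape (N, H, W, C); labels: int array of shape (N,).
    # Only a fraction of the target-class samples is watermarked; labels stay unchanged.
    images = images.copy()
    trigger = make_trigger(images.shape[1:], key=key)
    target_idx = np.flatnonzero(labels == target_class)
    rng = np.random.default_rng(key)
    chosen = rng.choice(target_idx, size=int(poison_rate * len(target_idx)), replace=False)
    images[chosen] = np.clip(images[chosen] + trigger, 0.0, 1.0)
    return images, labels

def apply_trigger(test_images, key=0):
    # At test time, adding the same watermark to any input should push the
    # backdoored model's prediction toward the chosen target class.
    trigger = make_trigger(test_images.shape[1:], key=key)
    return np.clip(test_images + trigger, 0.0, 1.0)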
2023
ICPR 2022 International Workshops and Challenges
backdoor attack; image watermarking; CNN
04 Publication in conference proceedings::04b Conference paper in a volume
Files attached to this item

Abbate_Image_2022.pdf
  Access: restricted (archive managers only); contact the author to request a copy
  Type: Publisher's version (published version with the publisher's layout)
  License: All rights reserved
  Size: 10.11 MB
  Format: Adobe PDF

Abbate_Frontespizio-indice_2022.pdf
  Access: open access
  Note: https://link.springer.com/book/10.1007/978-3-031-37745-7
  Type: Publisher's version (published version with the publisher's layout)
  License: All rights reserved
  Size: 125.91 kB
  Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11573/1693696