Differences in forgery attributes of images generated in CNN-synthesized and image-editing domains are large, and such differences make a unified image forgery detection and localization (IFDL) challenging. To this end, we present a hierarchical fine-grained formulation for IFDL representation learning. Specifically, we first represent forgery attributes of a manipulated image with multiple labels at different levels. Then, we perform fine-grained classification at these levels using the hierarchical dependency between them. As a result, the algorithm is encouraged to learn both comprehensive features and the inherent hierarchical nature of different forgery attributes, thereby improving the IFDL representation. In this work, we propose a Language-guided Hierarchical Fine-grained IFDL, denoted as HiFi-Net++. Specifically, HiFi-Net++ contains four components: multi-branch feature extractor, language-guided forgery localization enhancer, as well as classification and localization modules. Specifically, each branch of the multi-branch feature extractor learns to classify forgery attributes at one level, while localization and classification modules segment the pixel-level forgery region and detect image-level forgery, respectively. In addition, the language-guided forgery localization enhancer (LFLE), containing image and text encoders learned by contrastive language-image pre-training (CLIP), is used to further enrich the IFDL representation. More formally, LFLE takes specifically designed texts and the given image as multi-modal inputs and then generates the visual embedding and manipulation score maps, which are used to further improve HiFi-Net++ manipulation localization performance. Lastly, we construct a hierarchical fine-grained dataset to facilitate our study. We demonstrate the effectiveness of our method on $8$ different benchmarks for both tasks of IFDL and forgery attribute classification. Our source code and dataset can be found: https://github.com/CHELSEA234/HiFi_IFDL.

Language-guided Hierarchical Fine-grained Image Forgery Detection and Localization / Guo, Xiao; Liu, Xiaohong; Masi, Iacopo; Liu, Xiaoming. - In: INTERNATIONAL JOURNAL OF COMPUTER VISION. - ISSN 0920-5691. - (2024).

Language-guided Hierarchical Fine-grained Image Forgery Detection and Localization

Iacopo Masi
Penultimo
;
2024

Abstract

Differences in forgery attributes of images generated in CNN-synthesized and image-editing domains are large, and such differences make a unified image forgery detection and localization (IFDL) challenging. To this end, we present a hierarchical fine-grained formulation for IFDL representation learning. Specifically, we first represent forgery attributes of a manipulated image with multiple labels at different levels. Then, we perform fine-grained classification at these levels using the hierarchical dependency between them. As a result, the algorithm is encouraged to learn both comprehensive features and the inherent hierarchical nature of different forgery attributes, thereby improving the IFDL representation. In this work, we propose a Language-guided Hierarchical Fine-grained IFDL, denoted as HiFi-Net++. Specifically, HiFi-Net++ contains four components: multi-branch feature extractor, language-guided forgery localization enhancer, as well as classification and localization modules. Specifically, each branch of the multi-branch feature extractor learns to classify forgery attributes at one level, while localization and classification modules segment the pixel-level forgery region and detect image-level forgery, respectively. In addition, the language-guided forgery localization enhancer (LFLE), containing image and text encoders learned by contrastive language-image pre-training (CLIP), is used to further enrich the IFDL representation. More formally, LFLE takes specifically designed texts and the given image as multi-modal inputs and then generates the visual embedding and manipulation score maps, which are used to further improve HiFi-Net++ manipulation localization performance. Lastly, we construct a hierarchical fine-grained dataset to facilitate our study. We demonstrate the effectiveness of our method on $8$ different benchmarks for both tasks of IFDL and forgery attribute classification. Our source code and dataset can be found: https://github.com/CHELSEA234/HiFi_IFDL.
2024
image manipulation detection; deep learning; media forensics
01 Pubblicazione su rivista::01a Articolo in rivista
Language-guided Hierarchical Fine-grained Image Forgery Detection and Localization / Guo, Xiao; Liu, Xiaohong; Masi, Iacopo; Liu, Xiaoming. - In: INTERNATIONAL JOURNAL OF COMPUTER VISION. - ISSN 0920-5691. - (2024).
File allegati a questo prodotto
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1723591
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact