Artificial intelligence techniques can now synthesize entirely new videos or alter the facial expressions in existing ones, as has been convincingly demonstrated in the literature. Detecting this new threat, generally known as Deepfake although it comprises several different techniques, is a fundamental task in multimedia forensics, because such manipulated content can easily distort public opinion about a person or a specific event. This paper therefore introduces a new technique that distinguishes synthetically generated portrait videos from natural ones by exploiting inconsistencies in the prediction error that arise in the re-encoding phase. In particular, features based on the inter-frame prediction error are investigated jointly with a Long Short-Term Memory (LSTM) network able to learn the temporal correlation among consecutive frames. Preliminary results show that this sequence-based approach to distinguishing original from manipulated videos achieves promising performance.
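
The pipeline outlined in the abstract (per-frame prediction-error features fed to an LSTM) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the codec, the feature set, or the network configuration. The sketch assumes a zero-motion predictor (each frame predicted by the previous one) as a crude stand-in for a codec's motion-compensated prediction, and a hand-written single-layer LSTM with random, untrained weights. The names `prediction_error_features`, `init_params`, and `lstm_score` are hypothetical.

```python
import numpy as np


def prediction_error_features(frames, bins=8):
    """Per-frame features of the inter-frame prediction error.

    frames: array of shape (T, H, W), grayscale values in [0, 1].
    Assumption: a zero-motion predictor (previous frame) replaces the
    motion-compensated prediction a real video encoder would use.
    Returns an array of shape (T - 1, bins + 2).
    """
    frames = np.asarray(frames, dtype=np.float64)
    residual = np.abs(frames[1:] - frames[:-1])        # |frame_t - frame_{t-1}|
    flat = residual.reshape(residual.shape[0], -1)     # one row per frame pair
    mean = flat.mean(axis=1, keepdims=True)
    std = flat.std(axis=1, keepdims=True)
    # Normalised histogram of the absolute residual for each frame pair.
    hist = np.stack([
        np.histogram(row, bins=bins, range=(0.0, 1.0))[0] / row.size
        for row in flat
    ])
    return np.hstack([mean, std, hist])


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(input_dim, hidden_dim, seed=0):
    """Random (untrained) LSTM weights plus a linear read-out."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 0.1, (4 * hidden_dim, input_dim))   # input weights
    U = rng.normal(0.0, 0.1, (4 * hidden_dim, hidden_dim))  # recurrent weights
    b = np.zeros(4 * hidden_dim)
    w_out = rng.normal(0.0, 0.1, hidden_dim)
    return W, U, b, w_out, 0.0


def lstm_score(sequence, params):
    """Run a single-layer LSTM over a (T, D) feature sequence.

    Returns a value in (0, 1) read from the last hidden state.
    """
    W, U, b, w_out, b_out = params
    hidden_dim = U.shape[1]
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x in sequence:
        z = W @ x + U @ h + b
        i, f, g, o = np.split(z, 4)           # input, forget, candidate, output
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                     # cell state update
        h = o * np.tanh(c)                    # hidden state update
    return float(sigmoid(w_out @ h + b_out))


# Usage on synthetic data (random frames stand in for a decoded video).
video = np.random.default_rng(42).random((10, 32, 32))
features = prediction_error_features(video)
score = lstm_score(features, init_params(features.shape[1], hidden_dim=16))
```

Because the weights are untrained, `score` carries no meaning here. In a real system the LSTM would be trained on labelled original and manipulated videos, and the features would be taken from an actual encoder.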

Exploiting Prediction Error Inconsistencies through LSTM-based Classifiers to Detect Deepfake Videos / Amerini, I.; Caldelli, R. - (2020), pp. 97-102. (Paper presented at the 8th ACM Workshop on Information Hiding and Multimedia Security, IH and MMSec 2020, held in Denver, CO, USA) [10.1145/3369412.3395070].

Exploiting Prediction Error Inconsistencies through LSTM-based Classifiers to Detect Deepfake Videos

Amerini I.; Caldelli R.
2020

8th ACM Workshop on Information Hiding and Multimedia Security, IH and MMSec 2020
deep learning; LSTM; multimedia forensics; prediction error; synthetic video; video manipulation
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: Amerini_Exploiting_2020.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 6.93 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1494770
Citations
  • PubMed Central: n/a
  • Scopus: 46
  • Web of Science: n/a