
On the use of deep recurrent neural networks for detecting audio spoofing attacks / Scardapane, S; Stoffl, L; Rohrbein, F; Uncini, A. - 2017:(2017), pp. 3483-3490. (Paper presented at the 2017 International Joint Conference on Neural Networks, held in Anchorage, United States) [10.1109/IJCNN.2017.7966294].

On the use of deep recurrent neural networks for detecting audio spoofing attacks

Scardapane, S; Uncini, A
2017

Abstract

Biometric security systems based on predefined speech sentences are extremely common nowadays, particularly in low-cost applications where the simplicity of the hardware involved is a great advantage. Audio spoofing verification is the problem of detecting whether a speech segment acquired from such a system is genuine, or whether it was synthesized or modified by a computer in order to make it sound like an authorized person. Developing countermeasures for spoofing attacks is clearly essential for building effective biometric and security systems based on audio features, and all the more significant due to recent advances in generative machine learning. Nonetheless, the problem is complicated by the possible lack of knowledge of the technique(s) used to mount the attack, so that anti-spoofing systems should also be able to withstand spoofing attacks that were not considered explicitly in the training stage. In this paper, we analyze the use of deep recurrent networks applied to this task, i.e., networks formed by stacking multiple feedforward and recurrent layers. These networks are routinely used in speech recognition and language identification but, to the best of our knowledge, they have never been considered for this specific problem. We evaluate several architectures on the dataset released for the ASVspoof 2015 challenge last year. We show that, by working with very standard feature extraction routines and with a minimum amount of fine-tuning, the networks can already reach very promising error rates, comparable to state-of-the-art approaches, paving the way to further investigations on the problem using deep RNN models.
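The abstract describes networks formed by stacking feedforward and recurrent layers, classifying an utterance from standard per-frame acoustic features. A minimal NumPy sketch of that idea follows; the layer order, dimensions, and parameter names are illustrative assumptions, not the configuration evaluated in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """Feedforward layer with tanh activation, applied per frame."""
    return np.tanh(x @ w + b)

def rnn_layer(xs, w_in, w_rec, b):
    """Simple (Elman-style) recurrent layer: returns the hidden-state sequence."""
    h = np.zeros(w_rec.shape[0])
    out = []
    for x in xs:  # iterate over time steps (acoustic frames)
        h = np.tanh(x @ w_in + h @ w_rec + b)
        out.append(h)
    return np.stack(out)

def deep_rnn_score(frames, p):
    """Stack: feedforward -> recurrent -> recurrent -> linear readout.
    The utterance-level score is taken from the last hidden state."""
    h = dense(frames, p["w1"], p["b1"])
    h = rnn_layer(h, p["w2_in"], p["w2_rec"], p["b2"])
    h = rnn_layer(h, p["w3_in"], p["w3_rec"], p["b3"])
    logit = h[-1] @ p["w_out"] + p["b_out"]
    return 1.0 / (1.0 + np.exp(-logit))  # probability the segment is genuine

# Random (untrained) parameters for a 39-dimensional MFCC-like input.
d_in, d_h = 39, 16
params = {
    "w1": rng.normal(0, 0.1, (d_in, d_h)), "b1": np.zeros(d_h),
    "w2_in": rng.normal(0, 0.1, (d_h, d_h)),
    "w2_rec": rng.normal(0, 0.1, (d_h, d_h)), "b2": np.zeros(d_h),
    "w3_in": rng.normal(0, 0.1, (d_h, d_h)),
    "w3_rec": rng.normal(0, 0.1, (d_h, d_h)), "b3": np.zeros(d_h),
    "w_out": rng.normal(0, 0.1, d_h), "b_out": 0.0,
}

utterance = rng.normal(size=(100, d_in))  # 100 frames of dummy features
score = deep_rnn_score(utterance, params)
```

In practice such a model would be trained with backpropagation through time on genuine and spoofed utterances; the sketch only shows the forward pass that turns a variable-length frame sequence into a single spoofing score.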
2017
2017 International Joint Conference on Neural Networks
spoofing; biometric systems; deep neural networks
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this record
Scardapane_On-the-use_2017.pdf
Access: archive administrators only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 208.41 kB
Format: Adobe PDF
Contact the author

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1335728
Citations
  • PMC: ND
  • Scopus: 14
  • Web of Science: 11