This paper reports the accuracy and run-time results of a text-independent automatic speaker recognition (ASR) system based on Mel-Frequency Cepstrum Coefficients (MFCC) and Gaussian Mixture Models (GMM), developed for use in a security access control gate. A total of 450 speakers were randomly selected from the Voxforge.org audio database. Their utterances were enhanced using spectral subtraction, MFCCs were then extracted, and these coefficients were statistically modeled with GMMs to build each speaker's profile. For each speaker, two different speech files were used: the first to build the profile database, the second to test system performance. The proposed approach achieves an accuracy greater than 96%, and a single test run, implemented in Matlab, takes about 2 seconds on a common PC.
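The enrollment and test stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' Matlab implementation: it uses Python, replaces real MFCC vectors with synthetic 3-dimensional feature vectors, fits small diagonal-covariance GMMs with a plain EM loop, and assumes the common decision rule of choosing the speaker whose model gives the highest average log-likelihood. All function names, speaker labels and parameter values (`fit_gmm`, `identify`, `k=2`, `speaker_a`, and so on) are hypothetical.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of values."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def component_logs(x, gmm):
    """Log of weight times density for every mixture component."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under one GMM."""
    return sum(logsumexp(component_logs(x, gmm)) for x in frames) / len(frames)

def fit_gmm(frames, k=2, iters=20, seed=0, var_floor=1e-3):
    """Fit a k-component diagonal GMM with EM (assumed settings, not the paper's)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    # Initialise means from random frames, unit variances, equal weights.
    gmm = [(1.0 / k, list(m), [1.0] * dim) for m in rng.sample(frames, k)]
    for _ in range(iters):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = component_logs(x, gmm)
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weight, mean and variance of each component.
        updated = []
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10
            mean = [sum(r[j] * x[d] for r, x in zip(resp, frames)) / nj
                    for d in range(dim)]
            var = [max(var_floor,
                       sum(r[j] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, frames)) / nj)
                   for d in range(dim)]
            updated.append((max(nj / len(frames), 1e-10), mean, var))
        gmm = updated
    return gmm

def identify(frames, profiles):
    """Return the enrolled speaker whose GMM best explains the frames."""
    return max(profiles, key=lambda name: avg_log_likelihood(frames, profiles[name]))

def synth_frames(center, n, rng):
    """Synthetic stand-in for MFCC vectors: Gaussian noise around a center."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]

rng = random.Random(42)
centers = {"speaker_a": [0.0, 0.0, 0.0], "speaker_b": [4.0, -4.0, 4.0]}
# Enrollment: one file per speaker builds the profile database.
profiles = {name: fit_gmm(synth_frames(c, 200, rng)) for name, c in centers.items()}
# Test: a second, independent file from the same speaker is scored.
test_frames = synth_frames(centers["speaker_b"], 100, rng)
print(identify(test_frames, profiles))
```

In a real system the synthetic frames would be replaced by MFCC vectors computed from the enhanced audio, and the number of mixture components would be tuned to the data.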

Text Independent Automatic Speaker Recognition System Using Mel-Frequency Cepstrum Coefficients and Gaussian Mixture Models / Alfredo, Maesa; Garzia, Fabio; Scarpiniti, Michele; Cusani, Roberto. - In: JOURNAL OF INFORMATION SECURITY. - ISSN 2153-1234. - 03:04(2012), pp. 335-340. [10.4236/jis.2012.34041]

Text Independent Automatic Speaker Recognition System Using Mel-Frequency Cepstrum Coefficients and Gaussian Mixture Models

Garzia, Fabio; Scarpiniti, Michele; Cusani, Roberto
2012

access control; voice recognition; automatic speaker recognition; biometrics
01 Journal publication::01a Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/499586