Text Independent Automatic Speaker Recognition System Using Mel-Frequency Cepstrum Coefficients and Gaussian Mixture Models

Maesa, Alfredo; Garzia, Fabio; Scarpiniti, Michele; Cusani, Roberto

doi:10.4236/jis.2012.34041

This paper reports the accuracy and timing results of a text-independent automatic speaker recognition (ASR) system based on Mel-Frequency Cepstrum Coefficients (MFCC) and Gaussian Mixture Models (GMM), developed with the goal of building an access control security gate. A total of 450 speakers were randomly selected from the Voxforge.org audio database. Their utterances were enhanced using spectral subtraction, MFCC were then extracted, and the coefficients were modeled statistically with GMM to build a profile for each speaker. Two different speech files were used for each speaker: the first to build the profile database, the second to test system performance. The proposed approach achieves an accuracy greater than 96%, and a single test run, implemented in Matlab, takes about 2 seconds on a common PC.
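
The identification step described above, scoring a test utterance against each speaker's GMM profile and picking the best match, can be sketched as follows. This is a minimal illustration in Python, not the authors' Matlab implementation. It assumes diagonal-covariance GMMs and average per-frame log-likelihood as the matching score, neither of which is stated in the abstract. The function names, the two-dimensional toy "feature vectors" standing in for MFCC frames, and the hand-set profile parameters are all hypothetical.

```python
import math


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    Assumption: diagonal covariances; the abstract does not state the
    covariance type or the number of mixture components.
    """
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            # log of (weight * diagonal Gaussian density) for this component
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            comp_logs.append(ll)
        # log-sum-exp over components for numerical stability
        mx = max(comp_logs)
        total += mx + math.log(sum(math.exp(c - mx) for c in comp_logs))
    return total / len(frames)


def identify(frames, profiles):
    """Return the best-scoring speaker name and the score of every profile."""
    scores = {
        name: gmm_avg_loglik(frames, w, mu, var)
        for name, (w, mu, var) in profiles.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical profiles: (weights, means, variances) per speaker.
# In a real system these would be estimated from enrollment MFCC frames.
profiles = {
    "spk_a": ([0.5, 0.5], [[0.0, 0.0], [2.0, 2.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "spk_b": ([0.5, 0.5], [[-5.0, -5.0], [-3.0, -3.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

# Toy test frames standing in for the MFCC vectors of a test utterance.
test_frames = [[0.1, -0.2], [1.9, 2.1], [0.3, 0.2]]

best, scores = identify(test_frames, profiles)
print(best, scores)
```

In this sketch the profile database is a plain dictionary, so enrolling a speaker amounts to adding one entry. Spectral subtraction, MFCC extraction, and GMM training would all run before this step and are not shown.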

Text Independent Automatic Speaker Recognition System Using Mel-Frequency Cepstrum Coefficients and Gaussian Mixture Models / Maesa, A., Garzia, F., Scarpiniti, M., Cusani, R. - In: JOURNAL OF INFORMATION SECURITY. - ISSN 2153-1234. - 03:04(2012), pp. 335-340. [10.4236/jis.2012.34041]

Year of publication: 2012

Keywords: access control; voice recognition; automatic speaker recognition; biometrics

Type: 01 Journal publication::01a Journal article

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/499586
