Research products catalogue

A finite state Markov quantizer for speech coding / Falaschi, A., Giustiniani, M., Pierucci, P. - Print. - (1990), pp. 205-208. (International Conference on Acoustics, Speech, and Signal Processing, 1990. ICASSP-90. Albuquerque, NM) [10.1109/ICASSP.1990.115574].

A finite state Markov quantizer for speech coding

Falaschi, Alessandro; Giustiniani, M.; Pierucci, P.

1990

Abstract

A low-bit-rate codec based on an ergodic hidden Markov model (EHMM) is presented. A 256-state autoregressive Gaussian EHMM is trained with the Baum-Welch algorithm on speech uttered by eight different speakers. Initial estimates are obtained from vector quantization. The resulting EHMM is then used for Viterbi decoding of incoming speech data, and the state sequence obtained is encoded frame-synchronously. The bit rate is gradually lowered by cutting off low-probability transitions, which reduces the number of bits needed to encode the destination state. On the receiver side of the codec, the encoded sequence of spectra is used for linear predictive coding synthesis. Global entropy and distortion measures for different bit rates are reported and compared to vector quantization results. Informal listening tests compared the proposed method at various bit rates with vector quantization on the same material.
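
The pruning and encoding steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: all function names are hypothetical, the per-frame emission log-likelihoods are assumed to be computed elsewhere (the paper uses autoregressive Gaussian densities), the pruned transition probabilities are not renormalized, a known start state is assumed at both ends of the codec, and a fixed-length code for the destination index is assumed, which the abstract does not specify.

```python
import math


def prune_transitions(trans, keep):
    """For each source state, keep the `keep` most probable destination states."""
    successors = []
    for row in trans:
        ranked = sorted(range(len(row)), key=lambda j: -row[j])[:keep]
        successors.append(sorted(ranked))
    return successors


def viterbi_pruned(log_emis, trans, successors, start_state=0):
    """Most likely state path that uses only the transitions kept in `successors`.

    log_emis[t][s] is the (assumed given) log-likelihood of frame t in state s.
    """
    n_states = len(trans)
    neg_inf = float("-inf")
    score = [neg_inf] * n_states
    score[start_state] = log_emis[0][start_state]
    back = []
    for frame in log_emis[1:]:
        new_score = [neg_inf] * n_states
        pointer = [-1] * n_states
        for i in range(n_states):
            if score[i] == neg_inf:
                continue
            for j in successors[i]:
                if trans[i][j] <= 0.0:
                    continue  # a kept transition may still have zero probability
                cand = score[i] + math.log(trans[i][j]) + frame[j]
                if cand > new_score[j]:
                    new_score[j] = cand
                    pointer[j] = i
        score = new_score
        back.append(pointer)
    state = max(range(n_states), key=lambda s: score[s])
    if score[state] == neg_inf:
        raise ValueError("no state is reachable through the kept transitions")
    path = [state]
    for pointer in reversed(back):
        state = pointer[state]
        path.append(state)
    path.reverse()
    return path


def encode_path(path, successors):
    """Send each destination as its index in the source state's successor list."""
    return [successors[src].index(dst) for src, dst in zip(path, path[1:])]


def decode_path(symbols, successors, start_state=0):
    """Receiver side: rebuild the state path from the transmitted indices."""
    path = [start_state]
    for sym in symbols:
        path.append(successors[path[-1]][sym])
    return path


def bits_per_frame(keep):
    """Fixed-length code size for one destination index (an assumption)."""
    return math.ceil(math.log2(keep))
```

Under these assumptions, an unpruned 256-state model needs `bits_per_frame(256)`, that is 8 bits per frame, while keeping only the 16 most probable destinations per state needs `bits_per_frame(16)`, that is 4 bits. The receiver recovers the state path with `decode_path` and would then look up the spectrum associated with each state for synthesis.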

	Year of publication
	
				1990
			
	Conference name
	
				International Conference on Acoustics, Speech, and Signal Processing, 1990. ICASSP-90.
			
	Keywords
	
				ergodic hidden Markov model; vector quantization; speech coding
			
	Type
	
				04 Publication in conference proceedings::04b Conference paper in volume
			
	Citation
	
				A finite state Markov quantizer for speech coding / Falaschi, A., Giustiniani, M., Pierucci, P. - Print. - (1990), pp. 205-208. (International Conference on Acoustics, Speech, and Signal Processing, 1990. ICASSP-90. Albuquerque, NM) [10.1109/ICASSP.1990.115574].

Files attached to this product

File	Access	Type	License	Size	Format
Falaschi_Finite-state-Markov_1990.pdf	Open access	Publisher's version (published version with the publisher's layout)	All rights reserved	450.02 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/494473
