A finite state Markov quantizer for speech coding / Falaschi, Alessandro; Giustiniani, M.; Pierucci, P. - PRINT. - (1990), pp. 205-208. (Paper presented at the International Conference on Acoustics, Speech, and Signal Processing, 1990, ICASSP-90, held in Albuquerque, NM) [10.1109/ICASSP.1990.115574].
A finite state Markov quantizer for speech coding
Falaschi, Alessandro
1990
Abstract
A low-bit-rate codec based on an ergodic hidden Markov model (EHMM) is presented. A 256-state autoregressive Gaussian EHMM is trained on speech uttered by eight different speakers by means of the Baum-Welch algorithm. Initial estimates are obtained from vector quantization. The resulting EHMM is then used for Viterbi decoding of incoming speech data. The state sequence obtained is encoded frame-synchronously. The bit rate is gradually lowered by cutting off low-probability transitions, which reduces the number of bits needed to encode the destination state. The encoded spectral sequence is used, on the receiver side of the codec, for linear predictive coding synthesis. Global entropy and distortion measures for different bit rates are reported and compared to vector quantization results. Informal listening tests were performed by comparing the results of the proposed method at various bit rates with those of vector quantization on the same material.

| File | Size | Format |
|---|---|---|
| Falaschi_Finite-state-Markov_1990.pdf (open access). Type: publisher's version (published version with the publisher's layout). License: all rights reserved. | 450.02 kB | Adobe PDF |
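The bit-rate reduction step described in the abstract (cutting off low-probability transitions so that fewer bits are needed to encode the destination state) can be illustrated with a minimal sketch. The pruning rule is an assumption: the abstract does not say how transitions are selected, so this example simply keeps the `keep` most probable destinations of each state and sends a fixed-length index into that list. The function names (`prune_transitions`, `bits_per_transition`, `encode_path`, `decode_path`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def prune_transitions(A, keep):
    """Keep the `keep` most probable destinations per source state.

    Assumption: a simple top-K rule; the paper's actual criterion is not
    given in the abstract. Returns the renormalized transition matrix and,
    for each source state, the list of retained destination states.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Column indices of the `keep` largest entries in each row.
    top = np.argsort(A, axis=1)[:, -keep:]
    rows = np.arange(n)[:, None]
    pruned = np.zeros_like(A)
    pruned[rows, top] = A[rows, top]
    # Renormalize so that every row is again a probability distribution.
    pruned /= pruned.sum(axis=1, keepdims=True)
    return pruned, top


def bits_per_transition(keep):
    """Fixed-length bits needed to index one of `keep` destinations."""
    return (keep - 1).bit_length()


def encode_path(path, top):
    """Encode a state path as (first state, indices into retained lists)."""
    codes = []
    for prev, cur in zip(path[:-1], path[1:]):
        hits = np.flatnonzero(top[prev] == cur)
        if hits.size == 0:
            raise ValueError(f"transition {prev}->{cur} was pruned")
        codes.append(int(hits[0]))
    return path[0], codes


def decode_path(first, codes, top):
    """Rebuild the state path on the receiver side from the indices."""
    path = [first]
    for code in codes:
        path.append(int(top[path[-1], code]))
    return path
```

Under these assumptions, a 256-state model with no pruning needs `bits_per_transition(256)`, i.e. 8 bits per frame for the destination state, while keeping 16 destinations per state needs 4. The state path passed to `encode_path` must only use retained transitions, which holds if the Viterbi decoding is run on the pruned model.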