
[1] Xu Xianghua, Zhu Jie, Guo Qiang. Speaker-independent speech recognition based on HMM state-restructuring method [J]. Journal of Southeast University (English Edition), 2004, 20 (4): 427-430. [doi:10.3969/j.issn.1003-7985.2004.04.007]

Speaker-independent speech recognition based on HMM state-restructuring method

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
20
Issue:
2004 4
Page:
427-430
Research Field:
Information and Communication Engineering
Publishing date:
2004-12-30

Info

Title:
Speaker-independent speech recognition based on HMM state-restructuring method
Author(s):
Xu Xianghua, Zhu Jie, Guo Qiang
Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai 200030, China
Keywords:
speech recognition; hidden Markov model; expectation maximization algorithm; HMM Toolkit (HTK)
PACS:
TN912.34;TP391.42
DOI:
10.3969/j.issn.1003-7985.2004.04.007
Abstract:
Based on confusions between hidden Markov model (HMM) states, a state-restructuring method is proposed. In this method, HMM states are restructured by sharing Gaussian components with their related states, and the re-estimation of the added parameters, i.e., the inter-state weights, is derived under the expectation maximization (EM) framework. Experiments are performed on speaker-independent, large-vocabulary, continuous Mandarin speech recognition. The results show that the state-restructured systems outperform the baseline and achieve a significant improvement in recognition accuracy over the conventional parameter-increasing method. These comparative results confirm that the state-restructuring method is efficient.
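
Illustrative sketch (not taken from the paper): the abstract does not give the formulas, so the code below assumes one plausible reading of the method. Each restructured state's output density is taken to be a weighted sum of its own Gaussian mixture and those of its related states, and the inter-state weights are updated with the standard EM rule for mixture weights. The 1-D Gaussians, the state names `s1` and `s2`, the function names, and the toy data are all hypothetical, and state occupancies are assumed to come from a forward-backward pass that is not shown.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_pdf(x, components):
    """Density of a Gaussian mixture; components is a list of (weight, mean, var)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in components)


def restructured_pdf(x, inter_weights, state_gmms):
    """Assumed restructured density: sum over related states k of w_k * b_k(x)."""
    return sum(w * gmm_pdf(x, state_gmms[k]) for k, w in inter_weights.items())


def reestimate_inter_weights(observations, occupancies, inter_weights, state_gmms):
    """One EM step for the inter-state weights of a single restructured state.

    occupancies[t] is the posterior probability of being in this state at
    time t (assumed to be supplied by forward-backward, not computed here).
    """
    acc = {k: 0.0 for k in inter_weights}
    total = 0.0
    for x, gamma in zip(observations, occupancies):
        # Joint likelihood of each shared state's mixture with the observation.
        joint = {k: w * gmm_pdf(x, state_gmms[k]) for k, w in inter_weights.items()}
        norm = sum(joint.values())
        if norm <= 0.0:
            continue  # skip frames with zero likelihood under every mixture
        for k, value in joint.items():
            acc[k] += gamma * value / norm  # E-step: weighted responsibility
        total += gamma
    if total == 0.0:
        return dict(inter_weights)  # nothing to update from
    return {k: acc[k] / total for k in acc}  # M-step: normalized counts


# Toy example: state s1 borrows the Gaussians of a related state s2.
state_gmms = {"s1": [(1.0, 0.0, 1.0)], "s2": [(1.0, 3.0, 1.0)]}
weights = {"s1": 0.9, "s2": 0.1}
obs = [0.1, -0.2, 2.9, 3.1]
occ = [1.0, 1.0, 1.0, 1.0]
new_weights = reestimate_inter_weights(obs, occ, weights, state_gmms)
```

In this toy case half of the frames lie near the related state's mean, so the update shifts weight toward `s2` while the weights still sum to one. A real system would use multivariate Gaussians and accumulate statistics over all training utterances.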

Memo

Memo:
Biographies: Xu Xianghua (1977–), female, graduate; Zhu Jie (corresponding author), male, doctor, professor, zhujie@sjtu.edu.cn.
Last Update: 2004-12-20