

Emotion recognition of Uyghur speech using uncertain linear discriminant analysis

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

2017, 33(4)
Research Field: Computer Science and Engineering

Emotion recognition of Uyghur speech using uncertain linear discriminant analysis
Tashpolat Nizamidin1,2, Zhao Li1, Zhang Mingyang1, Xu Xinzhou1, Askar Hamdulla2
1Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China
2School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China
Key words: Uyghur language; speech emotion corpus; pitch; formant; uncertain linear discriminant analysis (ULDA)
Abstract: To achieve efficient and compact low-dimensional features for speech emotion recognition, a novel feature reduction method using uncertain linear discriminant analysis (ULDA) is proposed. Following the same principles as conventional linear discriminant analysis (LDA), the uncertainties of noisy or distorted input data are exploited to estimate maximally discriminant directions. The effectiveness of the proposed ULDA is demonstrated on the Uyghur speech emotion recognition task. The emotional features of Uyghur speech, in particular the fundamental frequency and formants, are analyzed in the collected emotional data. ULDA is then employed for dimensionality reduction of the emotional features and achieves better performance than other dimensionality reduction techniques. Uyghur speech emotion recognition is implemented by feeding the ULDA-reduced low-dimensional data into a support vector machine (SVM). The experimental results show that, with an appropriate uncertainty estimation algorithm, ULDA outperforms its conventional LDA counterpart on Uyghur speech emotion recognition.
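The pipeline described in the abstract can be sketched as follows. This is a minimal illustration using conventional LDA as a stand-in for ULDA (the paper's ULDA additionally folds per-observation uncertainty estimates into the scatter-matrix computation, which scikit-learn's LDA does not support); the emotion corpus is replaced by synthetic, well-separated class data, and all names and parameters here are illustrative assumptions, not the authors' implementation.

```python
# Sketch: acoustic features -> LDA-style dimensionality reduction -> SVM.
# Conventional LDA stands in for ULDA; data are synthetic.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, n_per_class, n_feats = 4, 50, 20   # e.g. 4 emotion categories

# Synthetic feature vectors: one Gaussian cluster per emotion class.
X = np.vstack([rng.normal(loc=c, scale=2.0, size=(n_per_class, n_feats))
               for c in range(n_classes)])
y = np.repeat(np.arange(n_classes), n_per_class)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# LDA projects onto at most (n_classes - 1) discriminant directions;
# the low-dimensional projections are then classified by an RBF-kernel SVM.
model = make_pipeline(
    LinearDiscriminantAnalysis(n_components=n_classes - 1),
    SVC(kernel="rbf", gamma="scale"))
model.fit(X_tr, y_tr)
print("test accuracy:", model.score(X_te, y_te))
```

Swapping ULDA into this skeleton would only change the reduction step: the projection matrix is estimated from uncertainty-adjusted within-class and between-class scatter matrices rather than from the raw sample scatters.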




Biographies: Tashpolat Nizamidin (1988—), male, graduate; Zhao Li (corresponding author), male, doctor, professor, zhaoli@seu.edu.cn.
Foundation item: The National Natural Science Foundation of China (Nos. 61673108, 61231002).
Citation: Tashpolat Nizamidin, Zhao Li, Zhang Mingyang, et al. Emotion recognition of Uyghur speech using uncertain linear discriminant analysis [J]. Journal of Southeast University (English Edition), 2017, 33(4): 437-443. DOI: 10.3969/j.issn.1003-7985.2017.04.008.
Last Update: 2017-12-20