[1] Wang Rugang, Xu Xinzhou, Huang Chengwei, et al. Speech emotion recognition via discriminant-cascading dimensionality reduction [J]. Journal of Southeast University (English Edition), 2016, 32 (2): 151-157. [doi:10.3969/j.issn.1003-7985.2016.02.004]

Speech emotion recognition via discriminant-cascading dimensionality reduction

Journal of Southeast University (English Edition) [ISSN: 1003-7985 / CN: 32-1325/N]

Volume:
32
Issue:
2016, No. 2
Page:
151-157
Research Field:
Information and Communication Engineering
Publishing date:
2016-06-20

Info

Title:
Speech emotion recognition via discriminant-cascading dimensionality reduction
Author(s):
Wang Rugang1,2, Xu Xinzhou1, Huang Chengwei1, Wu Chen1, Zhang Xinran1, Zhao Li1
1 Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China
2 College of Information Engineering, Yancheng Institute of Technology, Yancheng 224051, China
Keywords:
speech emotion recognition; discriminant-cascading locality preserving projections; discriminant analysis; dimensionality reduction
PACS:
TN911.72
DOI:
10.3969/j.issn.1003-7985.2016.02.004
Abstract:
To identify emotional information in speech accurately, the discriminant-cascading effect in dimensionality reduction for speech emotion recognition is investigated. Building on the existing locality preserving projections and the graph embedding framework, a novel discriminant-cascading dimensionality reduction method is proposed, named discriminant-cascading locality preserving projections (DCLPP). The proposed method uses supervised embedding graphs and keeps the inner products of samples in the original space, so that enough information is retained for speech emotion recognition. A kernel version, kernel DCLPP (KDCLPP), is also proposed to extend the mapping form. In experiments on the EMO-DB and eNTERFACE’05 corpora, the proposed method clearly outperforms common existing dimensionality reduction methods, including principal component analysis (PCA), linear discriminant analysis (LDA), locality preserving projections (LPP), local discriminant embedding (LDE), and graph-based Fisher analysis (GbFA), across different categories of classifiers.
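
The abstract does not give the DCLPP objective itself, so the following is only a minimal sketch of the standard locality-preserving cost from the graph embedding framework, evaluated on a simple supervised (same-class) graph. It is not the authors' method. The function names, the binary same-class weighting, and the toy data are all assumptions made for illustration.

```python
# Illustrative sketch only: the generic graph-embedding locality cost with a
# supervised graph. This is NOT the DCLPP algorithm from the paper.

def supervised_graph(labels):
    """Assumed weighting: w[i][j] = 1 if samples i and j share a label (i != j)."""
    n = len(labels)
    return [[1.0 if i != j and labels[i] == labels[j] else 0.0
             for j in range(n)] for i in range(n)]

def laplacian(w):
    """Graph Laplacian L = D - W, with D the diagonal matrix of row sums of W."""
    n = len(w)
    return [[(sum(w[i]) if i == j else 0.0) - w[i][j]
             for j in range(n)] for i in range(n)]

def project(samples, a):
    """One-dimensional linear projection y_i = a^T x_i."""
    return [sum(ak * xk for ak, xk in zip(a, x)) for x in samples]

def pairwise_cost(samples, w, a):
    """Locality cost in pairwise form: sum_ij w_ij * (y_i - y_j)^2."""
    y = project(samples, a)
    n = len(y)
    return sum(w[i][j] * (y[i] - y[j]) ** 2
               for i in range(n) for j in range(n))

def laplacian_cost(samples, w, a):
    """The same cost written as the quadratic form 2 * y^T L y."""
    y = project(samples, a)
    lap = laplacian(w)
    n = len(y)
    return 2.0 * sum(y[i] * lap[i][j] * y[j]
                     for i in range(n) for j in range(n))

# Toy data (hypothetical): two classes separated along the first feature,
# with within-class spread mostly along the second feature.
samples = [[0.0, 0.0], [0.1, 1.0], [1.0, 0.0], [1.1, 1.0]]
labels = [0, 0, 1, 1]
w = supervised_graph(labels)

cost_first = pairwise_cost(samples, w, [1.0, 0.0])   # project onto feature 1
cost_second = pairwise_cost(samples, w, [0.0, 1.0])  # project onto feature 2
```

On this toy data the projection onto the first feature has the lower cost, because it keeps same-class samples close together. Locality preserving projections look for the direction that minimizes this kind of cost under a scale constraint, which leads to a generalized eigenvalue problem rather than the direct comparison used here.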

Memo

Memo:
Biographies: Wang Rugang (born 1975), male, Ph.D., associate professor; Zhao Li (corresponding author), male, Ph.D., professor, zhaoli@seu.edu.cn.
Foundation items: The National Natural Science Foundation of China (No. 61231002, 61273266), the Ph.D. Program Foundation of Ministry of Education of China (No. 20110092130004), China Postdoctoral Science Foundation (No. 2015M571637).
Citation: Wang Rugang, Xu Xinzhou, Huang Chengwei, et al. Speech emotion recognition via discriminant-cascading dimensionality reduction[J]. Journal of Southeast University (English Edition), 2016, 32(2): 151-157. doi:10.3969/j.issn.1003-7985.2016.02.004.
Last Update: 2016-06-20