Journal of Southeast University (English Edition) [ISSN: 1003-7985 / CN: 32-1325/N]

Volume:
32
Issue:
2
Page:
158-163
Research Field:
Information and Communication Engineering
Publishing date:
2016-06-20

Info

Title:
A novel speech emotion recognition algorithm based on combination of emotion data field and ant colony search strategy
Author(s):
Zha Cheng (1,2), Tao Huawei (1), Zhang Xinran (1), Zhou Lin (1), Zhao Li (1), Yang Ping (2)
(1) Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China
(2) College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China
Keywords:
speech emotion recognition; emotional data field; ant colony search; human-machine interaction
CLC number:
TN912.3
DOI:
10.3969/j.issn.1003-7985.2016.02.005
Abstract:
In order to effectively conduct emotion recognition from spontaneous, non-prototypical and unsegmented speech, and thereby enable more natural human-machine interaction, a novel speech emotion recognition algorithm based on the combination of the emotional data field (EDF) and the ant colony search (ACS) strategy, called the EDF-ACS algorithm, is proposed. More specifically, the inter-relationships among the turn-based acoustic feature vectors of different labels are established by using the potential function in the EDF. To perform spontaneous speech emotion recognition, an artificial ant colony is used to mimic the turn-based acoustic feature vectors. Then, the canonical ACS strategy is used to investigate the movement direction of each artificial ant in the EDF, which is regarded as the emotional label of the corresponding turn-based acoustic feature vector. The proposed EDF-ACS algorithm is evaluated on the continuous audio/visual emotion challenge (AVEC) 2012 dataset, which contains spontaneous, non-prototypical and unsegmented speech emotion data. The experimental results show that the proposed EDF-ACS algorithm outperforms the existing state-of-the-art algorithms in turn-based speech emotion recognition.
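
The EDF-ACS pipeline outlined in the abstract rests on two mechanisms: labeled turn-based acoustic feature vectors exert attraction through a potential function in the emotional data field, and each unlabeled turn, treated as an artificial ant, moves through that field until its movement direction reveals its emotional label. The Python sketch below is illustrative only: it assumes the Gaussian-style potential used in data-field clustering [20], it collapses the pheromone-guided moves of the canonical ACS [22] into a single greedy comparison, and every function name and parameter value in it is hypothetical rather than the authors' implementation.

    import numpy as np

    def edf_potential(x, field_points, sigma=1.0):
        # Assumed Gaussian-style data-field potential with unit masses [20]:
        # phi(x) = sum_i exp(-(||x - x_i|| / sigma)**2)
        d = np.linalg.norm(field_points - x, axis=1)
        return np.sum(np.exp(-(d / sigma) ** 2))

    def classify_turn(x, fields_by_label, sigma=1.0):
        # Treat the unlabeled turn-based feature vector x as an artificial
        # ant and assign the label whose data field attracts it most; this
        # greedy choice stands in for the iterative pheromone-weighted moves.
        return max(fields_by_label,
                   key=lambda lab: edf_potential(x, fields_by_label[lab], sigma))

    # Toy usage with random 2-D "acoustic features" for two emotion labels.
    rng = np.random.default_rng(0)
    fields = {"high_arousal": rng.normal(+1.0, 0.3, size=(50, 2)),
              "low_arousal": rng.normal(-1.0, 0.3, size=(50, 2))}
    print(classify_turn(np.array([0.8, 1.1]), fields))  # expected: high_arousal

In the paper itself the label comes from where each ant's converged movement direction points after many search steps, not from one comparison; the sketch is meant only to show how the potential function couples feature vectors of different emotional labels.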

References:

[1] Jin Q, Li C, Chen S, et al. Speech emotion recognition with acoustic and lexical features[C]//IEEE International Conference on Acoustics, Speech and Signal Processing. Brisbane, Australia, 2015: 4749-4753.
[2] Ramakrishnan S, El Emary I M M. Speech emotion recognition approaches in human computer interaction[J]. Telecommunication Systems, 2013, 52(3): 1467-1478.
[3] Lu H, Frauendorfer D, Rabbi M, et al. StressSense: Detecting stress in unconstrained acoustic environments using smartphones[C]//Proceedings of the 2012 ACM Conference on Ubiquitous Computing. Pittsburgh, PA, USA, 2012: 351-360.
[4] Lee J S, Shin D H. A study on the interaction between human and smart devices based on emotion recognition[C]//Communications in Computer and Information Science. Berlin: Springer, 2013: 352-356.
[5] Anagnostopoulos C N, Iliou T, Giannoukos I. Features and classifiers for emotion recognition from speech: A survey from 2000 to 2011[J]. Artificial Intelligence Review, 2015, 43(2): 155-177. DOI:10.1007/s10462-012-9368-5.
[6] Ingale A B, Chaudhari D S. Speech emotion recognition[J]. International Journal of Soft Computing and Engineering, 2012, 2(1): 235-238.
[7] Lanjewar R B, Chaudhari D S. Speech emotion recognition: a review[J]. International Journal of Innovative Technology and Exploring Engineering, 2013, 2(4): 68-71.
[8] Huang C W, Wu D, Zhang X J, et al. Cascaded projection of Gaussian mixture model for emotion recognition in speech and EGG signal[J]. Journal of Southeast University (English Edition), 2015, 31(3): 320-326.
[9] Wöllmer M, Schuller B, Eyben F, et al. Combining long short-term memory and dynamic Bayesian networks for incremental emotion-sensitive artificial listening[J]. IEEE Journal of Selected Topics in Signal Processing, 2010, 4(5): 867-881. DOI:10.1109/jstsp.2010.2057200.
[10] Gharavian D, Sheikhan M, Nazerieh A, et al. Speech emotion recognition using FCBF feature selection method and GA-optimized fuzzy ARTMAP neural network[J]. Neural Computing and Applications, 2012, 21(8): 2115-2126. DOI:10.1007/s00521-011-0643-1.
[11] Wu D, Parsons T D, Narayanan S S. Acoustic feature analysis in speech emotion primitives estimation[C]//INTERSPEECH 2010, Conference of the International Speech Communication Association. Makuhari, Chiba, Japan, 2010:785-788.
[12] Swain M, Sahoo S, Routray A, et al. Study of feature combination using HMM and SVM for multilingual Odiya speech emotion recognition[J]. International Journal of Speech Technology, 2015, 18(3): 387-393. DOI:10.1007/s10772-015-9275-7.
[13] Khan M, Goskula T, Nasiruddin M, et al. Comparison between KNN and SVM method for speech emotion recognition[J]. International Journal on Computer Science and Engineering, 2011, 3(2): 607-611.
[14] Meng H, Bianchi-Berthouze N. Naturalistic affective expression classification by a multi-stage approach based on hidden Markov models[M]//Affective Computing and Intelligent Interaction. Berlin: Springer, 2011: 378-387.
[15] Ramirez G A, Baltrušaitis T, Morency L P. Modeling latent discriminative dynamic of multi-dimensional affective signals[M]//Affective Computing and Intelligent Interaction. Berlin: Springer, 2011: 396-406.
[16] Gers F A, Schraudolph N N, Schmidhuber J. Learning precise timing with LSTM recurrent networks[J]. The Journal of Machine Learning Research, 2003, 3(1): 115-143.
[17] Schuller B, Valstar M, Eyben F, et al. AVEC 2012: The continuous audio/visual emotion challenge[C]//Proceedings of the 14th ACM International Conference on Multimodal Interaction. New York, USA: ACM, 2012: 449-456.
[18] Cowie R, Douglas-Cowie E, Savvidou S, et al. “FEELTRACE”: An instrument for recording perceived emotion in real time[C]//ITRW on Speech and Emotion. Newcastle, Northern Ireland, UK, 2000:19-24.
[19] Hall M A. Correlation-based feature selection for machine learning[D]. Hamilton, New Zealand: Department of Computer Science, The University of Waikato, 1999.
[20] Gan W Y, Li D Y, Wang J M. Hierarchical clustering method based on data fields[J]. Acta Electronica Sinica, 2006, 34(2): 258-262.
[21] Weinberger K Q, Saul L K. Distance metric learning for large margin nearest neighbor classification[J]. The Journal of Machine Learning Research, 2009, 10(1): 207-244.
[22] Parpinelli R S, Lopes H S, Freitas A A. An ant colony based system for data mining: Applications to medical data[C]//Proceedings of the Genetic and Evolutionary Computation Conference. San Francisco, CA, USA, 2001: 791-797.

Memo

Memo:
Biographies: Zha Cheng (1979—), male, graduate; Zhao Li (corresponding author), male, doctor, professor, zhaoli@seu.edu.cn.
Foundation items: The National Natural Science Foundation of China (No. 61231002, 61273266, 61571106), the Foundation of the Department of Science and Technology of Guizhou Province (No. [2015]7637).
Citation: Zha Cheng, Tao Huawei, Zhang Xinran, et al. A novel speech emotion recognition algorithm based on combination of emotion data field and ant colony search strategy[J]. Journal of Southeast University (English Edition), 2016, 32(2): 158-163. DOI: 10.3969/j.issn.1003-7985.2016.02.005.
Last Update: 2016-06-20