
[1] Wang Qingyun, Zhao Li, Liang Ruiyu, et al. Annoyance-type speech emotion detection in working environment [J]. Journal of Southeast University (English Edition), 2013, 29 (4): 366-371. [doi:10.3969/j.issn.1003-7985.2013.04.003]

Annoyance-type speech emotion detection in working environment

Journal of Southeast University (English Edition) [ISSN: 1003-7985 / CN: 32-1325/N]

Volume:
29
Issue:
4 (2013)
Page:
366-371
Research Field:
Information and Communication Engineering
Publishing date:
2013-12-20

Info

Title:
Annoyance-type speech emotion detection in working environment
Author(s):
Wang Qingyun (1, 2), Zhao Li (1), Liang Ruiyu (1), Zhang Xiaodan (1)
(1) School of Information Science and Engineering, Southeast University, Nanjing 210096, China
(2) School of Communication Engineering, Nanjing Institute of Technology, Nanjing 211167, China
Keywords:
speech emotion detection; annoyance type; sentence length; shuffled frog leaping algorithm
CLC number:
TN912.3
DOI:
10.3969/j.issn.1003-7985.2013.04.003
Abstract:
To recognize annoyance in the working environment and evaluate emotional well-being, emotional speech is induced in a work environment to obtain adequate samples, and a Mandarin database of 2,000 samples is built. To detect annoyance-type emotion, the prosodic and voice quality feature parameters of the emotional utterances are extracted first. Then an improved back propagation (BP) neural network based on the shuffled frog leaping algorithm (SFLA) is proposed to recognize the emotion. The recognition capabilities of the BP, radial basis function (RBF), and SFLA neural networks are compared experimentally. The results show that the recognition rate of the SFLA neural network is 4.7% higher than that of the BP neural network and 4.3% higher than that of the RBF neural network. The experimental results demonstrate that training the random initial data with the SFLA optimizes the connection weights and thresholds of the neural network, speeds up convergence, and improves the recognition rate.
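The idea of letting the SFLA search for a network's connection weights and thresholds can be illustrated with a minimal sketch. It is not the authors' implementation: the toy XOR data (standing in for prosodic and voice quality feature vectors), the network size, the parameter values, and all function names are assumptions made for illustration, and the subsequent BP fine-tuning stage is omitted. Each "frog" is one flat vector holding every weight and threshold. Frogs are ranked by mean squared error and dealt into memeplexes. The worst frog in each memeplex leaps toward the memeplex best, then toward the global best, and is re-drawn at random if neither leap helps.

```python
import math
import random

# Toy two-class data (XOR); a stand-in for real speech feature vectors (assumption).
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_IN, N_HID = 2, 3
DIM = N_HID * (N_IN + 1) + (N_HID + 1)  # all connection weights plus thresholds


def sigmoid(z):
    # Clamp the argument so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z))))


def predict(w, x):
    """Forward pass of a one-hidden-layer network stored as the flat list w."""
    hidden = []
    for h in range(N_HID):
        base = h * (N_IN + 1)
        z = w[base + N_IN] + sum(w[base + i] * x[i] for i in range(N_IN))
        hidden.append(sigmoid(z))
    out = N_HID * (N_IN + 1)
    return sigmoid(w[out + N_HID] + sum(w[out + h] * hidden[h] for h in range(N_HID)))


def mse(w):
    """Fitness of a frog: mean squared error on the toy data (lower is better)."""
    return sum((predict(w, x) - y) ** 2 for x, y in DATA) / len(DATA)


def leap(worst, target, rng, max_step=2.0):
    """Move the worst frog a random fraction of the way toward a better frog."""
    r = rng.random()
    return [a + max(-max_step, min(max_step, r * (b - a))) for a, b in zip(worst, target)]


def sfla_train(n_memeplexes=4, frogs_per_memeplex=5, shuffles=30, local_steps=5, seed=0):
    rng = random.Random(seed)

    def new_frog():
        return [rng.uniform(-3.0, 3.0) for _ in range(DIM)]

    frogs = [new_frog() for _ in range(n_memeplexes * frogs_per_memeplex)]
    initial_best = min(mse(f) for f in frogs)
    for _ in range(shuffles):
        frogs.sort(key=mse)  # best frog first
        global_best = frogs[0]
        # Deal frogs into memeplexes like cards: frog k goes to memeplex k mod m.
        memeplexes = [frogs[i::n_memeplexes] for i in range(n_memeplexes)]
        for plex in memeplexes:
            for _ in range(local_steps):
                plex.sort(key=mse)
                worst = plex[-1]
                candidate = leap(worst, plex[0], rng)          # toward memeplex best
                if mse(candidate) >= mse(worst):
                    candidate = leap(worst, global_best, rng)  # toward global best
                if mse(candidate) >= mse(worst):
                    candidate = new_frog()                     # random replacement
                plex[-1] = candidate
        frogs = [f for plex in memeplexes for f in plex]       # shuffle back together
    best = min(frogs, key=mse)
    return best, mse(best), initial_best
```

Calling `sfla_train(seed=1)` returns the best weight vector found, its error, and the best error in the initial random population. In a combined scheme, the returned vector would serve as the starting point for ordinary BP training rather than as the final model.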


Memo

Memo:
Biography: Wang Qingyun (1972-), female, Ph.D., associate professor, wangqingyun@vip.163.com.
Foundation items: The National Natural Science Foundation of China (No. 61375028, 61301219), the China Postdoctoral Science Foundation (No. 2012M520973), the Scientific Research Funds of Nanjing Institute of Technology (No. ZKJ201202).
Last Update: 2013-12-20