
[1] Zhou Jian, Wei Xin, Liang Ruiyu, et al. Intelligibility evaluation of enhanced whisper in joint time-frequency domain [J]. Journal of Southeast University (English Edition), 2014, 30(3): 261-266. [doi:10.3969/j.issn.1003-7985.2014.03.001]

Intelligibility evaluation of enhanced whisper in joint time-frequency domain

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
30
Issue:
2014, No. 3
Page:
261-266
Research Field:
Information and Communication Engineering
Publishing date:
2014-09-30

Info

Title:
Intelligibility evaluation of enhanced whisper in joint time-frequency domain
Author(s):
Zhou Jian (周健) 1, 2; Wei Xin (魏昕) 3; Liang Ruiyu (梁瑞宇) 4; Zhao Li (赵力) 2
1. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, China
2. Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China
3. College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
4. College of Computer and Information, Hohai University, Nanjing 210098, China
Keywords:
whispered speech enhancement; intelligibility evaluation; real-valued discrete Gabor transform; joint time-frequency analysis
CLC number:
TN912.35
DOI:
10.3969/j.issn.1003-7985.2014.03.001
Abstract:
Factors that influence the intelligibility of enhanced whispered speech in the joint time-frequency domain are evaluated. Specifically, the effects of the density of the whisper time-frequency spectrum and of different regions of the enhanced spectrum are analyzed. Experimental results show that, when a higher-density spectrum is used, the speech enhancement algorithm based on joint time-frequency gain modification achieves a significant improvement in intelligibility. In addition, the spectrum region where the estimated spectrum magnitude is smaller than that of the clean spectrum contributes most to the intelligibility improvement of the enhanced whisper, whereas the region where the estimated magnitude is larger than twice that of the clean spectrum is detrimental to the intelligibility of whispered speech.
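The region comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `partition_regions`, the use of NumPy, the toy magnitudes, and the treatment of all remaining bins as a third "in between" region are assumptions made for illustration only. The inputs are assumed to be magnitude spectra of the enhanced (estimated) and clean whisper on the same time-frequency grid.

```python
import numpy as np

def partition_regions(est_mag, clean_mag):
    """Split time-frequency bins into regions by comparing the estimated
    magnitude with the clean magnitude (hypothetical helper, not from the paper)."""
    est_mag = np.asarray(est_mag, dtype=float)
    clean_mag = np.asarray(clean_mag, dtype=float)
    if est_mag.shape != clean_mag.shape:
        raise ValueError("spectra must have the same shape")
    # Region named in the abstract: estimate smaller than the clean spectrum.
    attenuated = est_mag < clean_mag
    # Region named in the abstract: estimate larger than twice the clean spectrum.
    over_amplified = est_mag > 2.0 * clean_mag
    # Assumption: every remaining bin is grouped into one intermediate region.
    in_between = ~(attenuated | over_amplified)
    return attenuated, in_between, over_amplified

# Toy 2x2 magnitude spectra (made-up values, for illustration only).
clean = np.array([[1.0, 2.0], [0.5, 4.0]])
est = np.array([[0.8, 5.0], [0.9, 4.0]])
att, mid, over = partition_regions(est, clean)
```

Each returned array is a boolean mask over the time-frequency grid, so a region can be isolated with, for example, `np.where(att, est, 0.0)` before resynthesis and listening tests.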

Memo

Biographies: Zhou Jian (1981-), male, Ph.D., lecturer; Zhao Li (corresponding author), male, Ph.D., professor, zhaoli@seu.edu.cn.
Foundation items: The National Natural Science Foundation of China (No. 61301295, 61273266, 61301219, 61201326, 61003131), the Natural Science Foundation of Anhui Province (No. 1308085QF100, 1408085MF113), the Natural Science Foundation of Jiangsu Province (No. BK20130241), the Natural Science Foundation of Higher Education Institutions of Jiangsu Province (No. 12KJB510021), the Doctoral Fund of Anhui University.
Citation: Zhou Jian, Wei Xin, Liang Ruiyu, et al. Intelligibility evaluation of enhanced whisper in joint time-frequency domain [J]. Journal of Southeast University (English Edition), 2014, 30(3): 261-266. [doi:10.3969/j.issn.1003-7985.2014.03.001]
Last Update: 2014-09-20