[1] Zhou Jian, Wei Xin, Liang Ruiyu, et al. Intelligibility evaluation of enhanced whisper in joint time-frequency domain [J]. Journal of Southeast University (English Edition), 2014, 30 (3): 261-266. [doi:10.3969/j.issn.1003-7985.2014.03.001]

Intelligibility evaluation of enhanced whisper in joint time-frequency domain

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:: 30
Issue:: 3 (2014)

Page:: 261-266

Research Field:: Information and Communication Engineering

Publishing date:: 2014-09-30

Info

Title:: Intelligibility evaluation of enhanced whisper in joint time-frequency domain

Author(s):: Zhou Jian¹,², Wei Xin³, Liang Ruiyu⁴, Zhao Li²
¹Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, China
²Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China
³College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
⁴College of Computer and Information, Hohai University, Nanjing 210098, China

Keywords:: whispered speech enhancement; intelligibility evaluation; real-valued discrete Gabor transform; joint time-frequency analysis

CLC number:: TN912.35

DOI:: 10.3969/j.issn.1003-7985.2014.03.001

Abstract:: Factors that influence the intelligibility of enhanced whispered speech in the joint time-frequency domain are evaluated. Specifically, the effects of the density of the whisper time-frequency spectrum and of different regions of the enhanced spectrum are analyzed. Experimental results show that, for a gain-modification based speech enhancement algorithm in the joint time-frequency domain, using a denser whisper spectrum significantly improves the intelligibility of the enhanced whisper. Additionally, the spectrum region where the estimated spectrum magnitude is smaller than the clean spectrum contributes most to the intelligibility improvement of the enhanced whisper, whereas the region where the estimated magnitude is larger than twice the clean spectrum is detrimental to intelligibility.
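
The region analysis described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the label strings, the toy values, and the handling of bins where the estimate exactly equals the clean magnitude are all assumptions made for illustration. The middle region (between one and two times the clean magnitude) is simply what remains between the two regions the abstract names.

```python
def partition_regions(estimated, clean):
    """Label each time-frequency bin by comparing estimated and clean magnitudes.

    Labels (names are illustrative, not taken from the paper):
      "under": estimated <= clean            (equality placed here by assumption)
      "mild" : clean < estimated <= 2*clean  (remaining region, implied)
      "over" : estimated > 2*clean
    """
    labels = []
    for est_row, clean_row in zip(estimated, clean):
        row = []
        for e, c in zip(est_row, clean_row):
            if e <= c:
                row.append("under")
            elif e <= 2 * c:
                row.append("mild")
            else:
                row.append("over")
        labels.append(row)
    return labels


def keep_region(estimated, labels, wanted):
    """Zero every bin whose label is not in `wanted`, isolating one region."""
    return [
        [e if lab in wanted else 0.0 for e, lab in zip(est_row, lab_row)]
        for est_row, lab_row in zip(estimated, labels)
    ]


# Toy 2x3 magnitude grids (rows: time frames, columns: frequency bins).
# The numbers are made up purely to exercise each branch.
clean = [[1.0, 2.0, 0.5], [0.8, 1.5, 1.0]]
estimated = [[0.7, 2.5, 1.6], [0.8, 3.1, 1.9]]

labels = partition_regions(estimated, clean)
under_only = keep_region(estimated, labels, {"under"})
```

Keeping only the bins of one region, as `keep_region` does, is one plausible way to prepare stimuli that test each region's contribution separately; the coefficients would still need to be transformed back to a waveform before any listening or objective evaluation.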

Memo

Memo:: Biographies: Zhou Jian(1981—), male, doctor, lecturer; Zhao Li(corresponding author), male, doctor, professor, zhaoli@seu.edu.cn.
Foundation items: The National Natural Science Foundation of China(No.61301295, 61273266, 61301219, 61201326, 61003131), the Natural Science Foundation of Anhui Province(No.1308085QF100, 1408085MF113), the Natural Science Foundation of Jiangsu Province(No.BK20130241), the Natural Science Foundation of Higher Education Institutions of Jiangsu Province(No.12KJB510021), the Doctoral Fund of Anhui University.
Citation: Zhou Jian, Wei Xin, Liang Ruiyu, et al. Intelligibility evaluation of enhanced whisper in joint time-frequency domain [J]. Journal of Southeast University (English Edition), 2014, 30 (3): 261-266. [doi:10.3969/j.issn.1003-7985.2014.03.001]

Last Update: 2014-09-20
