
[1] Zhou Jian, Wei Xin, Liang Ruiyu, et al. Intelligibility evaluation of enhanced whisper in joint time-frequency domain [J]. Journal of Southeast University (English Edition), 2014, 30(3): 261-266. [doi:10.3969/j.issn.1003-7985.2014.03.001]

Intelligibility evaluation of enhanced whisper in joint time-frequency domain

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
30
Issue:
2014, No. 3
Page:
261-266
Research Field:
Information and Communication Engineering
Publishing date:
2014-09-30

Info

Title:
Intelligibility evaluation of enhanced whisper in joint time-frequency domain
Author(s):
Zhou Jian (1, 2), Wei Xin (3), Liang Ruiyu (4), Zhao Li (2)
1. Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230601, China
2. Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, Southeast University, Nanjing 210096, China
3. College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
4. College of Computer and Information, Hohai University, Nanjing 210098, China
Keywords:
whispered speech enhancement; intelligibility evaluation; real-valued discrete Gabor transform; joint time-frequency analysis
PACS:
TN912.35
DOI:
10.3969/j.issn.1003-7985.2014.03.001
Abstract:
Several factors that influence the intelligibility of enhanced whispered speech in the joint time-frequency domain are evaluated. Specifically, both the spectrum density and different regions of the enhanced spectrum are analyzed. Experimental results show that, at certain spectrum densities, the speech enhancement algorithm based on joint time-frequency gain modification significantly improves intelligibility. In addition, the spectrum region where the estimated spectrum is smaller than the clean spectrum contributes most to the intelligibility improvement of the enhanced whisper, whereas the region where the estimated spectrum is more than twice the clean spectrum is detrimental to the intelligibility of whispered speech.
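
The region comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the label strings, and the handling of the middle case (estimate between the clean value and twice the clean value) are assumptions made for illustration, and the spectra are represented as flat lists of magnitudes, one value per time-frequency bin.

```python
from typing import List


def label_regions(estimated: List[float], clean: List[float]) -> List[str]:
    """Label each time-frequency bin by comparing estimated and clean magnitudes.

    Labels (hypothetical names):
      "attenuated"        estimated < clean
      "amplified_over_2x" estimated > 2 * clean
      "between"           everything else (assumed middle region)
    """
    if len(estimated) != len(clean):
        raise ValueError("estimated and clean must have the same number of bins")
    labels = []
    for est, ref in zip(estimated, clean):
        if est < ref:
            labels.append("attenuated")
        elif est > 2.0 * ref:
            labels.append("amplified_over_2x")
        else:
            labels.append("between")
    return labels


def keep_only(estimated: List[float], clean: List[float], wanted: str) -> List[float]:
    """Zero every bin outside the wanted region, so one region can be studied alone."""
    labels = label_regions(estimated, clean)
    return [est if lab == wanted else 0.0 for est, lab in zip(estimated, labels)]


# Toy magnitudes, chosen only to exercise each branch.
clean = [1.0, 1.0, 1.0, 2.0]
estimated = [0.5, 1.5, 2.5, 4.0]
print(label_regions(estimated, clean))
print(keep_only(estimated, clean, "attenuated"))
```

Isolating one region at a time in this way is one plausible route to measuring how much each region contributes to intelligibility; the listening or scoring procedure itself is not specified here.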

Memo

Memo:
Biographies: Zhou Jian (1981-), male, Ph.D., lecturer; Zhao Li (corresponding author), male, Ph.D., professor, zhaoli@seu.edu.cn.
Foundation items: The National Natural Science Foundation of China (No. 61301295, 61273266, 61301219, 61201326, 61003131), the Natural Science Foundation of Anhui Province (No. 1308085QF100, 1408085MF113), the Natural Science Foundation of Jiangsu Province (No. BK20130241), the Natural Science Foundation of Higher Education Institutions of Jiangsu Province (No. 12KJB510021), the Doctoral Fund of Anhui University.
Citation: Zhou Jian, Wei Xin, Liang Ruiyu, et al. Intelligibility evaluation of enhanced whisper in joint time-frequency domain [J]. Journal of Southeast University (English Edition), 2014, 30(3): 261-266. [doi:10.3969/j.issn.1003-7985.2014.03.001]
Last Update: 2014-09-20