Efficient design method of non-uniform cosine modulated filter bank for digital hearing aids

(1 School of Information Science and Engineering, Southeast University, Nanjing 210096, China) (2 School of Communication Engineering, Nanjing Institute of Technology, Nanjing 211167, China) (3 School of Mechanical and Electric Engineering, Guangzhou University, Guangzhou 510006, China)

Abstract: To address the speech quality degradation caused by traditional multi-channel filter banks, an efficient design method for a non-uniform cosine modulated filter bank (CMFB) based on the audiogram is proposed for digital hearing aids. First, a low-pass prototype filter is designed by the linear iterative algorithm. Second, the uniform CMFB is obtained from the cosine modulation formulas. Then, adjacent channels of the uniform filter bank are merged wherever the audiogram of the hearing-impaired person has a low or gradual slope. Finally, the corresponding non-uniform CMFB is obtained. Simulation results show that the signal processed by the proposed filter bank is similar to the original signal in both time-domain waveform and spectrogram, without significant distortion. The speech quality results show that the perceptual evaluation of speech quality (PESQ) score of the non-uniform CMFB is 35% higher than that of the traditional design, and the hearing-aid speech quality index (HASQI) increases by about 40%.

Key words: digital hearing aids; cosine modulated filter bank (CMFB); non-uniform filter bank; speech quality

Hearing aids are devices designed to improve the quality of hearing for hearing-impaired persons. With the development of signal processing technology, increasing attention has been paid to the speech processing algorithms of digital hearing aids[1-2]. Filters in digital hearing aids have evolved from a single-channel filter to multi-channel filter banks because the sensitivity of hearing loss patients to sound varies with frequency, and a multi-channel filter bank makes frequency-selective amplification possible[3].

Currently, a number of non-uniform filter banks have been shown to perform better than the widely used uniform multi-channel filter banks[4-5]. Some reconfigurable digital filter banks were proposed by Wei et al.[6-7]. Diverse subband filters can be obtained by setting different control parameters without altering the structure of the filter bank, and the computational complexity is reduced by utilizing advanced signal processing algorithms. A variable bandwidth filter bank for digital hearing aids using the Farrow structure was introduced to match the audiogram with a smaller matching error[8]. However, many filter banks ignore the specific condition of the hearing loss patient in the design process, and finding suitable parameters requires a series of calculations.

The audiogram is a graph of sound intensity versus frequency, and it is the direct basis for hearing aid fitting and audition investigation. A reconfigurable non-uniform cosine modulated filter bank (CMFB) based on the audiogram is proposed in this paper. The linear iterative algorithm is applied in the design procedure to obtain the required low-pass prototype filter, and the other subband filters are obtained by the cosine modulation formulas. Then, according to the trend of the audiogram of the hearing-impaired person, the adjacent channels of the uniform filter bank in regions where the audiogram has a low or gradual slope are merged to obtain the corresponding non-uniform CMFB. The advantages of the proposed structure are a simple design procedure, high flexibility in tuning the bandwidth of the subbands, high stopband attenuation and small system distortion.

1 Design of the Improved CMFB

1.1 Multi-channel filter bank

Nowadays, digital hearing aids with multi-channel loudness compensation are the most popular type of hearing aid. Fig.1 shows the basic structure of multi-channel loudness compensation. The key step is to design a filter bank with multiple amplitude responses[9].

The basic structure of the M-channel filter bank, called the analysis and synthesis filter bank, is shown in Fig.2. Hk(z) and Gk(z) (k=0,1,…,M-1) represent the analysis filters and synthesis filters, respectively. First, the input signal x(n) is decomposed into a set of subband signals as it passes through the analysis filter bank, and then the relevant algorithms are applied to process these subband signals. The processed signals are then recombined by the synthesis filter bank to reconstruct the signal.

1.2 Design of non-uniform CMFB

The main task in designing the uniform cosine modulated filter bank is to design a low-pass prototype filter, because the analysis and synthesis filter banks can then be obtained by means of cosine modulation. The design of the low-pass prototype filter uses the linear iteration technique, which converts the filter design into a polynomial approximation problem and iteratively minimizes the maximum error in the passband and stopband[10].
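As a rough illustration of what a low-pass prototype for an M-channel CMFB looks like, the following sketch designs one with cutoff π/(2M). It is not the linear iterative algorithm described above: a Kaiser-window design from SciPy is swapped in as a stand-in, and the function name `design_prototype`, the filter length (257 taps) and the window parameter (`beta=8.0`) are illustrative assumptions rather than values from the paper.

```python
import numpy as np
from scipy.signal import firwin, freqz


def design_prototype(num_channels=16, num_taps=257, beta=8.0):
    """Linear-phase low-pass prototype with cutoff pi/(2M).

    ASSUMPTION: Kaiser-window design used as a stand-in for the
    paper's linear iterative algorithm; taps and beta are arbitrary.
    """
    cutoff = 1.0 / (2 * num_channels)  # firwin normalizes 1.0 to Nyquist
    return firwin(num_taps, cutoff, window=("kaiser", beta))


hp = design_prototype()

# Inspect the stopband, taken here as frequencies at or above pi/M.
w, H = freqz(hp, worN=4096)
stopband = w >= np.pi / 16
stopband_db = 20 * np.log10(np.max(np.abs(H[stopband])))
print(f"taps: {len(hp)}, worst stopband level: {stopband_db:.1f} dB")
```

A minimax design would trade the window method's monotonically decaying sidelobes for an equiripple error, which is closer in spirit to minimizing the maximum passband and stopband error.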

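Suppose that Hk(z) and Gk(z) are the analysis and synthesis filters of an M-channel uniform CMFB. In the standard cosine modulated form, the corresponding impulse responses hk(n) and gk(n) are given by

$$h_k(n) = 2h_p(n)\cos\left[\frac{(2k+1)\pi}{2M}\left(n-\frac{N-1}{2}\right)+(-1)^k\frac{\pi}{4}\right]$$

$$g_k(n) = 2h_p(n)\cos\left[\frac{(2k+1)\pi}{2M}\left(n-\frac{N-1}{2}\right)-(-1)^k\frac{\pi}{4}\right]$$

where hp(n) is the impulse response of the prototype filter; N is the length of the prototype filter (n=0,1,…,N-1); M is the number of channels of the uniform CMFB; and k=0,1,…,M-1.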
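The modulation step can be sketched in a few lines of NumPy. The sketch assumes the standard CMFB modulation, in which channel k is centred at (2k+1)π/(2M) and the analysis and synthesis filters differ only in the sign of a (-1)^k π/4 phase term. The prototype is again a Kaiser-window stand-in, and the names `cosine_modulate` and `dtft_mag` are hypothetical helpers.

```python
import numpy as np
from scipy.signal import firwin


def cosine_modulate(hp, num_channels):
    """Return analysis filters h and synthesis filters g, each of shape (M, N)."""
    N = len(hp)
    n = np.arange(N)
    k = np.arange(num_channels)[:, None]  # column vector of channel indices
    phase = (2 * k + 1) * np.pi / (2 * num_channels) * (n - (N - 1) / 2)
    theta = (-1.0) ** k * np.pi / 4
    h = 2 * hp * np.cos(phase + theta)
    g = 2 * hp * np.cos(phase - theta)
    return h, g


def dtft_mag(x, omega):
    """Magnitude of the DTFT of x at a single frequency omega (rad/sample)."""
    n = np.arange(len(x))
    return np.abs(np.sum(x * np.exp(-1j * omega * n)))


M = 16
# ASSUMPTION: window-method prototype, not the paper's design.
hp = firwin(257, 1.0 / (2 * M), window=("kaiser", 8.0))
h, g = cosine_modulate(hp, M)

centers = (2 * np.arange(M) + 1) * np.pi / (2 * M)  # channel centre frequencies
print("gain of channel 0 at its own centre:", dtft_mag(h[0], centers[0]))
print("gain of channel 0 at channel 8 centre:", dtft_mag(h[0], centers[8]))
```

With a symmetric prototype, each synthesis filter in this form is the time-reversed analysis filter, which is why only the prototype has to be designed.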

A non-uniform CMFB is designed by merging the adjacent analysis and synthesis filters. Consider the analysis filter $\tilde{H}_i(z)$, which is obtained by merging $l_i$ adjacent analysis filters:

$$\tilde{H}_i(z)=\sum_{k=n_i}^{n_i+l_i-1}H_k(z),\quad i=0,1,\dots,\tilde{M}-1$$

where $n_i$ represents the band edge index of the i-th merged channel; $l_i$ is the total number of channels to be combined (with $\sum_i l_i = M$); and $\tilde{M}$ is the number of channels of the non-uniform CMFB. The synthesis filter $\tilde{G}_i(z)$ can be obtained by a similar method.

To eliminate the interchannel interference between adjacent channels, each $n_i$ is selected to be an integral multiple of $l_i$ ($i=0,1,\dots,\tilde{M}-1$).
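A minimal sketch of the merging step follows, assuming that each merged filter is the plain sum of li adjacent uniform filters starting at channel index ni (counted from 0), and that the groups tile all M channels in order. The function `merge_channels` and the toy 8-channel example are hypothetical and serve only to show the bookkeeping.

```python
import numpy as np


def merge_channels(filters, groups):
    """Sum adjacent rows of `filters` (shape (M, N)) into non-uniform channels.

    groups: list of (n_i, l_i) pairs, i.e. first channel index and
    number of channels merged, listed in ascending order.
    """
    num_uniform = filters.shape[0]
    covered = 0
    merged = []
    for n_i, l_i in groups:
        if n_i != covered:
            raise ValueError("groups must tile the channels without gaps")
        if n_i % l_i != 0:
            raise ValueError("n_i must be an integer multiple of l_i")
        merged.append(filters[n_i:n_i + l_i].sum(axis=0))
        covered += l_i
    if covered != num_uniform:
        raise ValueError("groups must cover all uniform channels")
    return np.stack(merged)


# Toy stand-in for 8 uniform filters: row k is a unit vector marking channel k.
uniform = np.eye(8)
groups = [(0, 1), (1, 1), (2, 2), (4, 4)]
nonuniform = merge_channels(uniform, groups)
print(nonuniform.shape)
print(nonuniform[3])
```

The same function applies unchanged to the synthesis filters, since they are merged with the same grouping.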

2 Experiment and Analysis

2.1 Experiment setting

The test speech used in the experiment was selected from the Mandarin speech audiometry monosyllable recognition rate test initiated by Xi et al.[11]. The sampling rate of the test signal is 16 kHz, the sampling precision is 16 bits, and the speech duration is about 3.5 s.

The number of the channels of the uniform CMFB is equal to 16, and the bandwidth of each channel is 500 Hz. The hearing thresholds in six audiometric frequencies can be acquired in six types of hearing loss audiograms. The audiometric frequencies are 0.25, 0.5, 1, 2, 4, 6 and 8 kHz. Corresponding hearing threshold vectors are shown in Tab.1. Type 2 is taken as an example. The number of the channels of the corresponding non-uniform CMFB is 13. The 9th to 12th channels are merged into a new channel and the bandwidth is 2 kHz.
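The channel arithmetic of the Type 2 example can be checked with a short calculation. It assumes that the 16 uniform channels evenly split the band from 0 Hz to the Nyquist frequency (8 kHz at a 16 kHz sampling rate) and that channels are numbered from 1, so the resulting band edges are an inference rather than values stated in the text; `band_edges` is a hypothetical helper.

```python
fs = 16000          # sampling rate in Hz
num_uniform = 16    # channels of the uniform CMFB

bandwidth = fs / 2 / num_uniform  # width of one uniform channel in Hz


def band_edges(first_channel, last_channel, width):
    """Lower and upper edge in Hz of a merged band (1-based, inclusive)."""
    return (first_channel - 1) * width, last_channel * width


first, last = 9, 12                       # channels merged for Type 2
low, high = band_edges(first, last, bandwidth)
num_merged = last - first + 1
num_nonuniform = num_uniform - num_merged + 1

print(bandwidth, low, high, high - low, num_nonuniform)
```

Counted from 0, the merged group starts at channel 8 and spans 4 channels, so the starting index is an integral multiple of the number of merged channels, as the design rule requires.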

2.2 Comparison of speech spectrum

The Gammatone filter is a standard bio-inspired auditory filter that simulates the cochlear basilar membrane. Its impulse response is the product of a gamma-function envelope and a tonal signal[12]. The time-domain waveforms of the original signal and of the signals processed by the different filter banks are shown in Fig.3.

Figs.3(a), (b) and (c) are the time-domain waveforms of the original signal and the two processed signals, while the lower panels (d), (e) and (f) are the spectrograms of the corresponding signals. The time-domain waveform of the output signal of the CMFB is similar to that of the input signal, with no apparent change in amplitude or envelope, and the energy distribution of the spectrogram is preserved. The fidelity of the proposed filter bank is better than that of the Gammatone filter bank, and the processed signal preserves the original information effectively.
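For readers unfamiliar with the Gammatone filter, the sketch below generates one impulse response as a gamma-function envelope multiplied by a tone. The filter order (4), the ERB formula of Glasberg and Moore, the bandwidth factor 1.019 and the 50 ms duration are common textbook choices assumed here; the text does not state the parameters used in the experiment.

```python
import numpy as np


def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)


def gammatone_ir(fc, fs=16000, order=4, duration=0.05):
    """Peak-normalized Gammatone impulse response centred at fc Hz.

    ASSUMPTION: order, bandwidth factor and duration are illustrative.
    """
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)
    envelope = t ** (order - 1) * np.exp(-2 * np.pi * b * t)
    g = envelope * np.cos(2 * np.pi * fc * t)
    return g / np.max(np.abs(g))


fs = 16000
ir = gammatone_ir(1000.0, fs=fs)

# Locate the spectral peak to confirm the filter is centred near fc.
spectrum = np.abs(np.fft.rfft(ir, n=8192))
peak_hz = np.argmax(spectrum) * fs / 8192
print(f"spectral peak near {peak_hz:.0f} Hz")
```

A Gammatone filter bank is obtained by repeating this for a set of centre frequencies, typically spaced on the ERB scale.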

2.3 Comparison of speech quality

2.3.1 Perceptual evaluation of speech quality (PESQ)

PESQ is chosen as the evaluation criterion to test the speech quality of the filter bank. A higher PESQ value indicates less difference between the original and the degraded speech.

Tab.2 shows the comparison of the PESQ value among four different filter banks: the uniform filter bank, the non-uniform filter bank, the Gammatone filter bank and the IIR filter bank[13]. It can be observed that the PESQ of the CMFB is 35% higher than those of the conventional filter banks, and the PESQ of the non-uniform CMFB is slightly higher than that of the uniform CMFB.

A recently introduced objective method is used for the qualitative and quantitative analysis of the processed speech[14]. The hearing-aid speech perception index (HASPI), which measures intelligibility, and the speech quality index HASQI represent the system performance. The closer the value is to one, the better the performance. The input parameters include a vector containing the hearing loss at six audiometric frequencies, so the result reflects the hearing loss of the individual patient.

Tab.3 shows the comparison of the HASPI and HASQI values of the three filter banks mentioned above when the hearing loss is not considered. The HASPI values of the three filter systems are nearly equal to 1, and the HASQI of the non-uniform CMFB is clearly superior to those of the traditional structures, an increase of approximately 40%.

When hearing loss is taken into account, the hearing loss vectors are calculated from the hearing thresholds at the six audiometric frequencies, as shown in Tab.1. The sound pressure level (SPL) of the input signal at the six audiometric frequencies is given by a particular function, and the raw vector is [70, 60, 50, 45, 50, 40]. Tab.4 shows the speech quality indices of the different filter banks matched to the given audiogram. The non-uniform filter bank outperforms the Gammatone filter bank and the IIR filter bank. The toolkit that computes the HASPI and HASQI adjusts the hearing loss vector at the six audiometric frequencies to the individual hearing-impaired person. Taking the Type 2 audiogram as an example, when the hearing loss is not included, the HASPI is 0.9999 and the HASQI is 0.9162; when the hearing loss vector is taken into account, the HASPI is 0.9957 and the HASQI is 0.9927. The speech intelligibility index remains nearly constant, while the quality index improves markedly.

2.4 Comparison of computational complexity

The CMFB has a simple design procedure and low computational complexity. The main task is to design a low-pass prototype filter, after which the analysis and synthesis filter banks are obtained by means of cosine modulation. All required parameters are calculated in advance.

Tab.5 shows the coefficient calculation time of the two traditional filter banks and the proposed filter bank. The number of channels is 16 in each case, and the calculation time is the average over 100 runs. The coefficient calculation time of the CMFB is lower than those of the Gammatone filter bank and the IIR filter bank, and the CMFB also performs better, with less distortion.
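One way to obtain an average coefficient calculation time over 100 runs is sketched below. The workload `cmfb_coefficients` is a hypothetical stand-in (window-method prototype plus cosine modulation), not the implementation that was timed for Tab.5, so absolute times from this sketch are not comparable with the reported values.

```python
import time

import numpy as np
from scipy.signal import firwin


def cmfb_coefficients(num_channels=16, num_taps=257):
    """ASSUMED workload: Kaiser-window prototype followed by cosine modulation."""
    hp = firwin(num_taps, 1.0 / (2 * num_channels), window=("kaiser", 8.0))
    n = np.arange(num_taps) - (num_taps - 1) / 2
    k = np.arange(num_channels)[:, None]
    phase = (2 * k + 1) * np.pi / (2 * num_channels) * n
    return 2 * hp * np.cos(phase + (-1.0) ** k * np.pi / 4)


def average_runtime(func, repeats=100):
    """Mean wall-clock time in seconds of `func` over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


avg_seconds = average_runtime(cmfb_coefficients)
print(f"average coefficient calculation time: {avg_seconds * 1e3:.3f} ms")
```

Timing the Gammatone and IIR coefficient routines with the same `average_runtime` helper keeps the comparison on equal footing.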

3 Conclusion

A design method for a non-uniform CMFB based on the audiogram is proposed in this paper. Starting from the uniform CMFB, the adjacent channels in regions where the audiogram has a low or gradual slope are merged, and the resulting non-uniform filter system matches the audiogram better within the margin of error. According to the experimental results, the speech quality of the non-uniform CMFB is better than that of the conventional Gammatone filter bank, with less distortion caused by the filter bank. The proposed structure is simple, has low computational complexity and performs well. It can therefore satisfy diverse demands by adjusting the number of merged channels.

[1]Öberg M, Marcusson J, Nägga K, et al. Hearing difficulties, uptake, and outcomes of hearing aids in people 85 years of age [J]. International Journal of Audiology, 2012, 51(2):108-115. DOI:10.3109/14992027.2011.622301.

[2]Zhao L, Zhang X R, Liang R Y, et al. Review on certain key algorithms of digital hearing aids[J]. Journal of Data Acquisition and Processing, 2015, 30(2):252-265. (in Chinese)

[3]Zou C R, Liang R Y, Xie Y. Research progress and outlook of speech processing algorithms for digital hearing aids[J]. Journal of Data Acquisition and Processing, 2016, 31(2):242-251. (in Chinese)

[4]Liu C W, Chang K C, Chuang M H, et al. 10-ms 18-band quasi-ANSI S1.11 1/3-octave filter bank for digital hearing aids[J]. IEEE Transactions on Circuits and Systems Ⅰ: Regular Papers, 2013, 60(3):638-649. DOI:10.1109/tcsi.2012.2209731.

[5]Huang S, Tian L, Ma X, et al. A reconfigurable sound wave decomposition filterbank for hearing aids based on nonlinear transformation[J]. IEEE Transactions on Biomedical Circuits and Systems, 2016, 10(2):487-496. DOI:10.1109/TBCAS.2015.2436916.

[6]Wei Y, Liu D. A reconfigurable digital filter bank for hearing-aid systems with a variety of sound wave decomposition plans[J]. IEEE Transactions on Biomedical Engineering, 2013, 60(6):1628-1635. DOI:10.1109/TBME.2013.2240681.

[7]Wei Y, Wang Y. Design of low complexity adjustable filter bank for personalized hearing aid solutions[J]. IEEE/ACM Transactions on Audio Speech and Language Processing, 2015, 23(5):923-931. DOI:10.1109/taslp.2015.2409774.

[8]Haridas N, Elias E. Efficient variable bandwidth filters for digital hearing aid using Farrow structure[J]. Journal of Advanced Research, 2016, 7(2):255-262. DOI:10.1016/j.jare.2015.06.002.

[9]Chong K S, Gwee B H, Chang J S. A 16-channel low-power nonuniform spaced filter bank core for digital hearing aids[J]. IEEE Transactions on Circuits & Systems Ⅱ: Express Briefs, 2006, 53(9):853-857. DOI:10.1109/tcsii.2006.881821.

[10]Selesnick I W, Burrus C S. Exchange algorithms that complement the Parks-McClellan algorithm for linear-phase FIR filter design[J]. IEEE Transactions on Circuits and Systems Ⅱ: Analog and Digital Signal Processing, 1997, 44(2):137-143.

[11]Ji F, Xi X, Chen A T, et al. The equivalence study of Mandarin monosyllable lists[J]. Chinese Journal of Otology, 2008, 6(1):17-20.

[12]Cao L T, Li R W, Shi Y Q, et al. Loudness compensation method based on human auditory for digital hearing aids[C]//International Conference on Biomedical Engineering and Informatics. Dalian, China 2014:335-340.

[13]Chen H L, Cheng G G. Novel design of perfect reconstructed quadrature mirror IIR filter banks[J]. Computer Science, 2009, 36(8):92-84. (in Chinese)

[14]Kates J M, Arehart K H. The hearing-aid speech perception index (HASPI)[J]. Speech Communication, 2014, 65:75-93. DOI:10.1016/j.specom.2014.06.002.

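Efficient design method of non-uniform cosine modulated filter bank for digital hearing aids

(1 School of Information Science and Engineering, Southeast University, Nanjing 210096) (2 School of Communication Engineering, Nanjing Institute of Technology, Nanjing 211167) (3 School of Mechanical and Electric Engineering, Guangzhou University, Guangzhou 510006)

Abstract: To address the speech quality degradation caused by filter decomposition in multi-channel hearing aids, an efficient design method for a non-uniform cosine modulated filter bank based on the audiogram is proposed. First, a low-pass prototype filter is designed by the linear iterative algorithm. Then, the uniform cosine modulated filter bank is obtained from the principle formulas. Finally, according to the audiogram of the hearing-impaired patient, the channels of the uniform filter bank that correspond to frequency ranges where the audiogram varies gently are merged into a single channel, yielding the corresponding non-uniform cosine modulated filter bank. Waveform experiments show that the time-domain waveform and spectrogram of the signal after the non-uniform filter bank exhibit no obvious distortion and only small differences. Speech quality experiments show that the PESQ score of the proposed non-uniform cosine modulated filter bank is about 35% higher than that of traditional filter banks, and the speech quality index HASQI is improved by about 40%.

Key words: digital hearing aids; cosine modulated filter bank; non-uniform filter bank; speech quality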

Foundation items: The National Natural Science Foundation of China (No.61375028, 61673108), China Postdoctoral Science Foundation (No.2016M601696), Qing Lan Project, the Program for Special Talent in Six Fields of Jiangsu Province (No.2016-DZXX-023), Jiangsu Planned Projects for Postdoctoral Research Funds (No.1601011B).

Citation: Guo Ruxue, Zhao Li, Liang Ruiyu, et al. Efficient design method of non-uniform cosine modulated filter bank for digital hearing aids[J]. Journal of Southeast University (English Edition), 2017, 33(2):140-144.

Biographies: Guo Ruxue (1993-), female, graduate; Zhao Li (corresponding author), male, doctor, professor, zhaoli@seu.edu.cn.