[1] Stylianou Y, Cappé O, Moulines E. Continuous probabilistic transform for voice conversion [J]. IEEE Transactions on Speech and Audio Processing, 1998, 6(2): 131-142.
[2] Kain A, Macon M W. Spectral voice conversion for text-to-speech synthesis [C]//International Conference on Acoustics, Speech, and Signal Processing. Seattle, USA, 1998: 285-288.
[3] Inanoglu Z. Transforming pitch in a voice conversion framework [D]. Cambridge, UK: St. Edmund’s College of the University of Cambridge, 2003: 28-32.
[4] Wu Z Z, Kinnunen T, Chng E S, et al. Text-independent F0 transformation with non-parallel data for voice conversion [C]//11th Annual Conference of the International Speech Communication Association. Makuhari, Japan, 2010: 1732-1735.
[5] Shao X, Milner B. Pitch prediction from MFCC vectors for speech reconstruction [C]//Proceedings of the International Conference on Acoustics, Speech, and Signal Processing. Montreal, Canada, 2004: 97-100.
[6] Basak D, Pal S, Patranabis D C. Support vector regression [J]. Neural Information Processing - Letters and Reviews, 2007, 11(10): 203-224.
[7] Song P, Bao Y Q, Zhao L, et al. Voice conversion using support vector regression [J]. Electronics Letters, 2011, 47(18): 1045-1046.
[8] Hwang H, Haddad R A. Adaptive median filters: new algorithms and results [J]. IEEE Transactions on Image Processing, 1995, 4(4): 499-502.
[9] Kominek J, Black A W. The CMU Arctic speech databases [C]//Proceedings of the 5th ISCA Speech Synthesis Workshop. Pittsburgh, USA, 2004: 223-224.
[10] Kawahara H, Masuda-Katsuse I, de Cheveigné A. Restructuring speech representation using pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: possible role of a repetitive structure in sounds [J]. Speech Communication, 1999, 27(3): 187-207.