[1] Song P, Jin Y, Zha C, et al. Speech emotion recognition method based on hidden factor analysis[J]. Electronics Letters, 2015, 51(1): 112-114. DOI:10.1049/el.2014.3339.
[2] Schuller B, Zhang Z, Weninger F, et al. Synthesized speech for model training in cross-corpus recognition of human emotion[J]. International Journal of Speech Technology, 2012, 15(3): 313-323. DOI:10.1007/s10772-012-9158-0.
[3] Deng J, Zhang Z, Marchi E, et al. Sparse autoencoder-based feature transfer learning for speech emotion recognition[C]//2013 Humaine Association Conference on Affective Computing and Intelligent Interaction (ACII). Geneva, Switzerland, 2013: 511-516. DOI:10.1109/acii.2013.90.
[4] Jin Y, Song P, Zheng W, et al. Speaker-independent speech emotion recognition based on two-layer multiple kernel learning[J]. IEICE Transactions on Information and Systems, 2013, E96-D(10): 2286-2289. DOI:10.1587/transinf.e96.d.2286.
[5] Kalinli O, Narayanan S. Prominence detection using auditory attention cues and task-dependent high level information[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2009, 17(5): 1009-1024. DOI:10.1109/tasl.2009.2014795.
[6] Kalinli O. Syllable segmentation of continuous speech using auditory attention cues[C]//International Speech Communication Association. Florence, Italy, 2011: 425-428.
[7] Wong W K, Zhao H T. Supervised optimal locality preserving projection[J]. Pattern Recognition, 2012, 45(1): 186-197. DOI:10.1016/j.patcog.2011.05.014.
[8] Yin Q, Qian S, Feng A. A fast refinement for adaptive Gaussian chirplet decomposition[J]. IEEE Transactions on Signal Processing, 2002, 50(6): 1298-1306. DOI:10.1109/tsp.2002.1003055.
[9] Bayram I. An analytic wavelet transform with a flexible time-frequency covering[J]. IEEE Transactions on Signal Processing, 2013, 61(5): 1131-1142. DOI:10.1109/tsp.2012.2232655.
[10] Noriega G. A neural model to study sensory abnormalities and multisensory effects in autism[J]. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2015, 23(2): 199-209. DOI:10.1109/tnsre.2014.2363775.
[11] Khoubrouy S A, Panahi I M S, Hansen J H L. Howling detection in hearing aids based on generalized Teager-Kaiser operator[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2015, 23(1): 154-161. DOI:10.1109/taslp.2014.2377575.
[12] Ali S A, Khan A, Bashir N. Analyzing the impact of prosodic feature (pitch) on learning classifiers for speech emotion corpus[J]. International Journal of Information Technology and Computer Science, 2015, 7(2): 54-59. DOI:10.5815/ijitcs.2015.02.07.
[13] Ajmera P K, Jadhav D V, Holambe R S. Text-independent speaker identification using Radon and discrete cosine transforms based features from speech spectrogram[J]. Pattern Recognition, 2011, 44(10): 2749-2759. DOI:10.1016/j.patcog.2011.04.009.
[14] Burkhardt F, Paeschke A, Rolfes M, et al. A database of German emotional speech[C]//International Speech Communication Association. Lisbon, Portugal, 2005: 1517-1520.
[15] Martin O, Kotsia I, Macq B, et al. The eNTERFACE'05 audio-visual emotion database[C]//IEEE 22nd International Conference on Data Engineering Workshops. Atlanta, GA, USA, 2006: 8-10. DOI:10.1109/icdew.2006.145.
[16] Schuller B, Steidl S, Batliner A, et al. The INTERSPEECH 2010 paralinguistic challenge[C]//International Speech Communication Association. Chiba, Japan, 2010: 2794-2797.
[17] Eyben F, Wöllmer M, Schuller B. openSMILE: The Munich versatile and fast open-source audio feature extractor[C]//Proceedings of the International Conference on Multimedia. Firenze, Italy, 2010: 1459-1462.
[18] Moustakidis S, Mallinis G, Koutsias N, et al. SVM-based fuzzy decision trees for classification of high spatial resolution remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2012, 50(1): 149-169. DOI:10.1109/tgrs.2011.2159726.
[19] Kim E H, Hyun K H, Kim S H, et al. Improved emotion recognition with a novel speaker-independent feature[J]. IEEE/ASME Transactions on Mechatronics, 2009, 14(3): 317-325.