[1] Xiong Ying, Zhu Jie. Feature study for improving Chinese overlapping ambiguity resolution based on SVM [J]. Journal of Southeast University (English Edition), 2007, 23(2): 179-184. [doi:10.3969/j.issn.1003-7985.2007.02.006]

Feature study for improving Chinese overlapping ambiguity resolution based on SVM

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
23
Issue:
2 (2007)
Page:
179-184
Research Field:
Computer Science and Engineering
Publishing date:
2007-06-30

Info

Title:
Feature study for improving Chinese overlapping ambiguity resolution based on SVM
Author(s):
Xiong Ying, Zhu Jie
Department of Electronic Engineering, Shanghai Jiaotong University, Shanghai 200240, China
Keywords:
support vector machine; Chinese overlapping ambiguity; Chinese word segmentation; word probability model
PACS:
TP391
DOI:
10.3969/j.issn.1003-7985.2007.02.006
Abstract:
To improve Chinese overlapping ambiguity resolution based on a support vector machine (SVM), statistical features for representing the feature vectors are studied. First, four statistical parameters (mutual information, accessor variety, two-character word frequency and single-character word frequency) are each used on their own to describe the feature vectors. Then, other parameters are added as complementary features to the parameters that obtain the best results, in order to further improve classification performance. Experimental results show that features represented by mutual information, single-character word frequency and accessor variety obtain an optimum result of 94.39%. Compared with a commonly used word probability model, the accuracy is improved by 6.62%. These comparative results confirm that classification performance can be improved by feature selection and representation.
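
As a rough illustration of two of the statistics named in the abstract, the sketch below computes mutual information and accessor variety from a toy corpus. It is not the authors' implementation. The abstract gives no formulas, so the definitions here are assumptions: mutual information is taken as the pointwise form log2(p(xy) / (p(x) p(y))) over adjacent characters, and accessor variety as the smaller of the numbers of distinct left and right neighbours, with all sentence boundaries simplified to a single marker each. The function names and the feature layout in `overlap_features` are hypothetical.

```python
import math
from collections import Counter


def char_counts(sentences):
    """Count single characters and adjacent character pairs in the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        unigrams.update(s)
        bigrams.update(s[i:i + 2] for i in range(len(s) - 1))
    return unigrams, bigrams


def mutual_information(x, y, unigrams, bigrams):
    """Pointwise mutual information of adjacent characters x and y (assumed definition)."""
    p_xy = bigrams[x + y] / sum(bigrams.values())
    p_x = unigrams[x] / sum(unigrams.values())
    p_y = unigrams[y] / sum(unigrams.values())
    if p_xy == 0 or p_x == 0 or p_y == 0:
        return float("-inf")  # the pair or a character was never observed
    return math.log2(p_xy / (p_x * p_y))


def accessor_variety(s, sentences):
    """min(distinct left neighbours, distinct right neighbours) of string s.

    Simplification: every sentence start maps to one "<S>" marker and every
    sentence end to one "</S>" marker.
    """
    left, right = set(), set()
    for sent in sentences:
        start = sent.find(s)
        while start != -1:
            end = start + len(s)
            left.add(sent[start - 1] if start > 0 else "<S>")
            right.add(sent[end] if end < len(sent) else "</S>")
            start = sent.find(s, start + 1)
    return min(len(left), len(right))


def overlap_features(abc, sentences):
    """Hypothetical feature vector for a 3-character overlapping string ABC,
    where both AB and BC are candidate words (AB/C versus A/BC)."""
    a, b, c = abc
    unigrams, bigrams = char_counts(sentences)
    return [
        mutual_information(a, b, unigrams, bigrams),
        mutual_information(b, c, unigrams, bigrams),
        accessor_variety(a + b, sentences),
        accessor_variety(b + c, sentences),
    ]
```

A vector like the one returned by `overlap_features` could then be passed, together with a label for the correct segmentation, to an SVM trainer; which parameters are combined and how they are scaled is exactly the kind of choice the study compares.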

Memo:
Biographies: Xiong Ying (1977-), female, graduate; Zhu Jie (corresponding author), male, doctor, professor, zhujie@sjtu.edu.cn.
Last Update: 2007-06-20