
[1] Guo Yuqin, Yuan Fang, Liu Haibo, et al. Text categorization based on fuzzy classification rules tree [J]. Journal of Southeast University (English Edition), 2008, 24 (3): 339-342. [doi:10.3969/j.issn.1003-7985.2008.03.021]

Text categorization based on fuzzy classification rules tree
基于模糊分类规则树的文本分类

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
24
Issue:
3 (2008)
Page:
339-342
Research Field:
Computer Science and Engineering
Publishing date:
2008-09-30

Info

Title:
Text categorization based on fuzzy classification rules tree
基于模糊分类规则树的文本分类
Author(s):
Guo Yuqin (1, 2), Yuan Fang (1), Liu Haibo (1)
1 College of Mathematics and Computer Science, Hebei University, Baoding 071002, China
2 Tianjin Branch of the People’s Bank of China, Tianjin 300040, China
郭玉琴1 2 袁方1 刘海博1
1河北大学数学与计算机学院, 保定071002; 2 中国人民银行天津分行, 天津 300040
Keywords:
text categorization; fuzzy classification association rule; classification rules tree; fuzzy classification rules tree
文本分类 模糊分类关联规则 分类规则树 模糊分类规则树
PACS:
TP393
DOI:
10.3969/j.issn.1003-7985.2008.03.021
Abstract:
Conventional fuzzy class-association methods classify a new text by repeatedly scanning every rule in the classifier, which is inefficient. To address this problem, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The rules of the classifier are stored in a tree, and the compactness of the FCR-tree saves significant space when a large rule set contains many repeated words. In contrast to ordinary classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies with which those words appear in texts. Therefore, the construction and structure of an FCR-tree differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is unnecessary to search the sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach is clearly more efficient than the conventional method.
针对传统的基于关联规则的文本分类方法在分类文本时需要遍历分类器中的所有规则, 分类效率非常低的问题, 提出一种基于模糊分类规则树(FCR-tree)的文本分类方法.分类器中的规则以树的形式存储, 由于树型结构避免了重复结点的存储, 节省了存储空间.模糊分类关联规则与一般分类规则相比, 不仅包含了词条信息, 还包含了词条出现频度对应的模糊集, 所以FCR-tree的构建过程及树的结构不同于一般规则树CR-tree.为降低构建及遍历FCR-tree的难度, 采用了构造多棵k-FCR-tree的方法.在搜索规则树时, 如果结点中的词条没在待分类文本中出现, 则不需要再搜索该结点引导的子树, 大大减少了需要匹配的规则的数量.实验表明该方法是可行的, 与遍历分类器的分类方法相比, 分类效率有了明显提高.
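
The pruning idea described in the abstract can be illustrated with a small prefix tree. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the names `Node`, `insert_rule`, `match_rules`, `classify`, and `membership` are hypothetical, the two fuzzy sets ("low" and "high") and their membership function are invented for illustration, and the scoring rule (confidence times matching degree, summed per class) is an assumed stand-in for whatever the paper uses. The k-FCR-tree partitioning is not modeled.

```python
from dataclasses import dataclass, field


def membership(freq: int, fuzzy_set: str) -> float:
    """Toy membership functions over word frequency (assumed, not from the paper)."""
    high = min(max((freq - 1) / 4, 0.0), 1.0)  # 0 at freq <= 1, 1 at freq >= 5
    return high if fuzzy_set == "high" else 1.0 - high


@dataclass
class Node:
    # Maps an item (word, fuzzy_set) to the child node reached through it.
    children: dict = field(default_factory=dict)
    # (class_label, confidence) pairs of the rules that end at this node.
    labels: list = field(default_factory=list)


def insert_rule(root: Node, items, label: str, confidence: float) -> None:
    """Store one rule; sorting the items lets rules with a common prefix share nodes."""
    node = root
    for item in sorted(items):
        node = node.children.setdefault(item, Node())
    node.labels.append((label, confidence))


def match_rules(node: Node, freqs: dict, degree: float = 1.0):
    """Yield (label, confidence, degree) for each rule matched by the text."""
    for label, confidence in node.labels:
        yield label, confidence, degree
    for (word, fuzzy_set), child in node.children.items():
        if word not in freqs:
            continue  # word absent from the text: skip the whole sub-tree
        d = min(degree, membership(freqs[word], fuzzy_set))
        if d > 0.0:
            yield from match_rules(child, freqs, d)


def classify(root: Node, freqs: dict):
    """Return the class with the largest summed confidence * degree, or None."""
    scores = {}
    for label, confidence, degree in match_rules(root, freqs):
        scores[label] = scores.get(label, 0.0) + confidence * degree
    return max(scores, key=scores.get) if scores else None


root = Node()
insert_rule(root, [("goal", "high"), ("match", "low")], "sports", 0.9)
insert_rule(root, [("goal", "high"), ("team", "high")], "sports", 0.8)
insert_rule(root, [("bank", "high")], "finance", 0.95)

predicted = classify(root, {"goal": 5, "match": 1})
```

In this sketch the two "sports" rules share the node for `("goal", "high")`, which is the kind of space saving the abstract attributes to the tree, and the sub-trees under `("bank", "high")` and `("team", "high")` are never visited for a text that contains neither word.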


Memo

Biographies: Guo Yuqin (born 1981), female, graduate; Yuan Fang (corresponding author), male, Ph.D., professor, yuanfang@hbu.cn.
Foundation items: The National Natural Science Foundation of China (No. 60473045), the Technology Research Project of Hebei Province (No. 05213573), the Research Plan of Education Office of Hebei Province (No. 2004406).
Citation: Guo Yuqin, Yuan Fang, Liu Haibo. Text categorization based on fuzzy classification rules tree [J]. Journal of Southeast University (English Edition), 2008, 24 (3): 339-342.
Last Update: 2008-09-20