|Table of Contents|

[1] Wang Tao, Lu Yansheng,. Mining condensed frequent subtree base [J]. Journal of Southeast University (English Edition), 2006, 22 (1): 48-53. [doi:10.3969/j.issn.1003-7985.2006.01.011]
Copy

Mining condensed frequent subtree base()
Share:

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volumn:
22
Issue:
2006 1
Page:
48-53
Research Field:
Computer Science and Engineering
Publishing date:
2006-03-20

Info

Title:
Mining condensed frequent subtree base
Author(s):
Wang Tao Lu Yansheng
College of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Keywords:
data mining tree pattern condensed subtree base
PACS:
TP311
DOI:
10.3969/j.issn.1003-7985.2006.01.011
Abstract:
In frequent tree pattern mining, the number of frequent subtrees generated is often too large.To tackle this problem, the concept of condensed frequent subtree base is proposed.The base consists of the maximal frequent subtrees for a series of support thresholds.It is a subset of frequent subtrees, and is used to approximate the support of arbitrary frequent subtrees with guaranteed maximal error bound.In addition, an algorithm is developed to mine such a condensed subtree base in a database of labeled rooted ordered trees.The algorithm adopts the way of right-most extension to generate systematically all frequent rooted ordered subtrees.Several techniques are proposed to prune the branches that do not correspond to the maximal frequent subtrees.Heuristic techniques are used to arrange the order of computation so that relatively expensive computation is avoided as much as possible.Experimental results show that the size of the base is less than 10% of that of the complete set, and the algorithm outperforms the previous algorithms.

References:

[1] Zaki M J.Efficiently mining frequent trees in a forest [C]//8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Alberta, Canada, 2002: 68-75.
[2] Asai T, Abe K, Kawasoe S, et al.Efficient substructure discovery from large semi-structured data [C]//Proc 2002 SIAM Int Conf on Data Mining (SDM’02).Arlington, VA, 2002: 457-473.
[3] Pei J, Dong G, Zou W, et al.On computing condensed frequent pattern bases[C]//Proc 2002 Int Conf on Data Mining (ICDM’02).Maebashi, Japan, 2002: 378-385.
[4] Wang C, Hong M, Pei J, et al.Efficient pattern-growth methods for frequent tree pattern mining [C]//The Eighth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’04).Sydney, Australia, 2004: 441-451.
[5] Chi Y, Yang Y, Xia Y, et al.CMTreeMiner:mining both closed and maximal frequent subtrees [C]//The Eighth Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD’04).Sydney, Australia, 2004: 63-73.
[6] Kudo T.FREQT:an implementation of FREQT [EB/OL].(2003-03-21)[2004-12-15].http://chasen.org/~taku/software/freqt/.

Memo

Memo:
Biographies: Wang Tao(1969—), female, graduate;Lu Yansheng(corresponding author), male, professor, lys@mail.hust.edu.cn.
Last Update: 2006-03-20