[1] Lu Yansheng, Hu Rong, Zou Lei, Zhou Chong, et al. Mining maximal pattern-based subspace clusters in high dimensional space [J]. Journal of Southeast University (English Edition), 2006, 22 (4): 490-495. [doi:10.3969/j.issn.1003-7985.2006.04.010]

Mining maximal pattern-based subspace clusters in high dimensional space

Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
22
Issue:
4 (2006)
Page:
490-495
Research Field:
Computer Science and Engineering
Publishing date:
2006-12-30

Info

Title:
Mining maximal pattern-based subspace clusters in high dimensional space
Author(s):
Lu Yansheng, Hu Rong, Zou Lei, Zhou Chong
School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Keywords:
subspace clustering; pattern similarity; maximal pattern-based subspace clusters
PACS:
TP311
DOI:
10.3969/j.issn.1003-7985.2006.04.010
Abstract:
This paper studies pattern-based subspace clustering, a special type of subspace clustering that uses pattern similarity as its similarity measure. Unlike most traditional clustering algorithms, which group objects whose values are close in all dimensions or in a subset of dimensions, clustering by pattern similarity groups objects that exhibit a coherent pattern of rise and fall within a subspace. A novel approach, named EMaPle, is designed to mine the maximal pattern-based subspace clusters. EMaPle searches for clusters only in the attribute enumeration space, which is small compared with the large number of row combinations in typical datasets, and it exploits novel pruning techniques. EMaPle can find clusters that satisfy coherence constraints, size constraints, and sign constraints neglected in MaPle. Experiments on both synthetic and real data sets demonstrate that EMaPle is more effective and scalable than MaPle.
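
The abstract does not define pattern similarity, so the following is only a minimal sketch under a stated assumption: it uses the pScore coherence test commonly used in pattern-based clustering, where two objects are coherent on two attributes if the difference of their differences is at most a threshold. The names `p_score`, `is_delta_pcluster`, and `delta` are illustrative and are not taken from the paper, and the code is not EMaPle itself; it only checks whether a given set of rows and attributes forms a coherent cluster.

```python
from itertools import combinations


def p_score(x, y, a, b):
    """pScore of the 2x2 submatrix formed by rows x, y on attributes a, b.

    Assumed definition: |(x[a] - x[b]) - (y[a] - y[b])|.
    """
    return abs((x[a] - x[b]) - (y[a] - y[b]))


def is_delta_pcluster(rows, attrs, delta):
    """Return True if every pair of rows is coherent on every pair of attributes."""
    return all(
        p_score(x, y, a, b) <= delta
        for x, y in combinations(rows, 2)
        for a, b in combinations(attrs, 2)
    )


# Three objects whose values are far apart but rise and fall together
# on attributes 0, 1, 2; attribute 3 breaks the shared pattern.
data = [
    [1, 5, 3, 9],
    [11, 15, 13, 2],
    [21, 25, 23, 40],
]

coherent_on_first_three = is_delta_pcluster(data, [0, 1, 2], delta=0)
coherent_on_all_four = is_delta_pcluster(data, [0, 1, 2, 3], delta=0)
print(coherent_on_first_three, coherent_on_all_four)
```

In this toy data the rows are shifted copies of one another on the first three attributes, so a distance-based method would separate them while the pattern-based test groups them. This brute-force check enumerates all row pairs and attribute pairs, which is exactly the cost that a mining algorithm with pruning aims to avoid.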

Memo

Memo:
Biography: Lu Yansheng(1949—), male, professor, lys@mail.hust.edu.cn.
Last Update: 2006-12-20