[1] Xia Shixiong, Li Wenchao, Zhou Yong, Zhang Lei, et al. Improved k-means clustering algorithm [J]. Journal of Southeast University (English Edition), 2007, 23 (3): 435-438. [doi:10.3969/j.issn.1003-7985.2007.03.027]
Improved k-means clustering algorithm
Journal of Southeast University (English Edition)[ISSN:1003-7985/CN:32-1325/N]

Volume:
23
Issue:
3 (2007)
Page:
435-438
Research Field:
Automation
Publishing date:
2007-09-30

Title:
Improved k-means clustering algorithm
Author(s):
Xia Shixiong, Li Wenchao, Zhou Yong, Zhang Lei, Niu Qiang
School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, China
Keywords:
clustering; k-means algorithm; silhouette coefficient
PACS:
TP18
DOI:
10.3969/j.issn.1003-7985.2007.03.027
Abstract:
To address two drawbacks of the k-means algorithm, namely that the number of clusters in a data set must be known in advance and that the result is sensitive to the choice of initial cluster centers, an improved k-means clustering algorithm is proposed. First, the concept of a silhouette coefficient is introduced, and the optimal number of clusters Kopt for a data set with unknown class information is determined by calculating the silhouette coefficient of the objects in clusters under different K values. Then the distribution of the data set is obtained through hierarchical clustering, and the initial cluster centers are determined. Finally, the clustering is completed by the traditional k-means algorithm. Theoretical analysis shows that the improved k-means clustering algorithm has reasonable computational complexity. Experimental results on the IRIS test data set show that the algorithm distinguishes different clusters reasonably and recognizes outliers efficiently, and that the entropy generated by the algorithm is lower.
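
The three stages named in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the record does not give the paper's exact formulas, so the code assumes the standard silhouette definition s(i) = (b(i) - a(i)) / max(a(i), b(i)), centroid-linkage agglomerative merging for the hierarchical step, and Euclidean distance. The function names (`hierarchical_centers`, `kmeans`, `silhouette`, `improved_kmeans`) and the `k_max` parameter are hypothetical.

```python
import math
from itertools import combinations


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def hierarchical_centers(data, k):
    """Assumed linkage: merge the two clusters with the closest centroids
    until k clusters remain, then return their centroids."""
    clusters = [[p] for p in data]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: dist(centroid(clusters[ij[0]]), centroid(clusters[ij[1]])),
        )
        merged = clusters.pop(j)  # j > i, so index i stays valid
        clusters[i].extend(merged)
    return [centroid(c) for c in clusters]


def kmeans(data, centers, max_iter=100):
    """Traditional k-means (Lloyd iterations) from the given initial centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: dist(p, centers[c])) for p in data
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = centroid(members)
    return labels, centers


def silhouette(data, labels):
    """Mean silhouette coefficient over all objects (needs >= 2 clusters)."""
    clusters = {}
    for p, lab in zip(data, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(data, labels):
        own = clusters[lab]
        if len(own) == 1:  # convention: a singleton scores 0
            scores.append(0.0)
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for m, other in clusters.items()
            if m != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


def improved_kmeans(data, k_max):
    """Try K = 2..k_max and keep the clustering with the highest mean silhouette."""
    best = None
    for k in range(2, k_max + 1):
        labels, centers = kmeans(data, hierarchical_centers(data, k))
        score = silhouette(data, labels)
        if best is None or score > best[0]:
            best = (score, k, labels, centers)
    return best
```

Calling `improved_kmeans(points, k_max)` on a list of coordinate tuples returns the best silhouette score, the chosen K, the cluster labels, and the final centers. The pairwise search in `hierarchical_centers` is deliberately naive and is only practical for small data sets.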

Memo:
Biography: Xia Shixiong(1961—), male, professor, xiasx@cumt.edu.cn.
Last Update: 2007-09-20