K-means+: an autonomous clustering algorithm

Download	K-means+: an autonomous clustering algorithm (PDF, 617 KiB)
DOI	https://doi.org/10.13140/RG.2.1.1113.7365
Author	Guan, Yu¹; Ghorbani, Ali A.; Belacel, N.¹
Affiliation	National Research Council of Canada. NRC Institute for Information Technology
Format	Text, Technical Report
Physical description	27 p.
Abstract	The traditional clustering algorithm, K-means, is well known for its simplicity and low time complexity. However, its usability is limited because the clustering result depends heavily on two user-defined parameters: the selection of the initial centroid seeds and the number of clusters (k). A new clustering algorithm, called K-means+, is proposed to extend K-means. The K-means+ algorithm can automatically determine a semi-optimal number of clusters according to the statistical nature of the data; moreover, the initial centroid seeds are not critical to the clustering results. Experimental results on the Iris and KDD-99 data sets illustrate the robustness of the K-means+ clustering algorithm, especially for large amounts of data in a high-dimensional space.
Publication date	2003
Language	English
NPARC number	21277103
Record identifier	33a96c39-4891-4815-bf19-1135a8984501
Record created	2015-12-01
Record modified	2020-06-02