Journal article details
Journal of Computing and Information Technology
A Simple Density with Distance Based Initial Seed Selection Technique for K Means Algorithm
Desikan, Kalyani [1]; Syed Azimuddin, Sajidha [2]
[1] Department of Mathematics, School of Advanced Sciences, VIT, Chennai, India; [2] School of Computing Science and Engineering, VIT, Chennai, India
Keywords: Computer science; Information Systems
DOI: 10.20532/cit.2017.1003605
Subject classification: Computer Science (General)
Source: Sveuciliste u Zagrebu
【 Abstract 】

Open issues with the K-means algorithm include identifying the number of clusters, selecting initial seed concepts, assessing clustering tendency, handling empty clusters, and identifying outliers. In this paper we propose a novel and simple technique that considers both the density and the distance of the concepts in a dataset to identify initial seed concepts for clustering. Many authors have proposed techniques for identifying initial seed concepts, but our method ensures that the initial seed concepts are chosen from different clusters that are to be generated by the clustering solution. The hallmark of our algorithm is that it is a single-pass algorithm that does not require any extra parameters to be estimated. Further, our seed concepts are actual concepts from the dataset, not the mean of representative concepts as is the case in many other algorithms. We implemented the proposed algorithm and compared its results with the interval-based technique of Fouad Khan; our method outperforms the interval-based method. We also compared our method with the original random K-means and K-means++ algorithms.
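
The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea of combining density and distance when choosing seeds. It is not the authors' algorithm. It assumes each concept is a row of a NumPy array, takes density to be the number of concepts within the mean pairwise distance (a radius derived from the data, so no extra parameter is tuned), and greedily picks actual concepts that are both dense and far from the seeds already chosen. The function name `density_distance_seeds` is hypothetical.

```python
import numpy as np


def density_distance_seeds(X, k):
    """Pick k rows of X as initial seeds.

    Illustrative heuristic only, NOT the procedure from the paper:
    density and the scoring rule below are assumptions made for this sketch.
    Returns (seed_points, seed_indices).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Assumed density: how many points (including the point itself) lie
    # within the mean pairwise distance of each point.
    radius = dist.sum() / (n * (n - 1)) if n > 1 else 0.0
    density = (dist <= radius).sum(axis=1)

    # First seed: the densest point (ties go to the lowest index).
    seeds = [int(np.argmax(density))]
    min_dist = dist[seeds[0]].copy()

    while len(seeds) < k:
        # Favor points that are dense and far from every chosen seed.
        score = density * min_dist
        score[seeds] = -1.0  # never pick the same point twice
        nxt = int(np.argmax(score))
        seeds.append(nxt)
        min_dist = np.minimum(min_dist, dist[nxt])

    return X[seeds], seeds
```

Because the returned seeds are rows of the input, they can be passed directly as the initial centers of any K-means implementation that accepts explicit starting centers. The full distance matrix costs O(n^2) memory, so this sketch suits small datasets only.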

【 License 】

CC BY   
