Conference Paper Details
2018 3rd International Conference on Insulating Materials, Material Application and Electrical Engineering
Research on Efficient K_Means Parallel Algorithm Based on Hadoop Distributed Architecture
Materials science; Radio electronics; Electrical engineering
Qian, Lin^1 ; Wang, Lin^1 ; Mei, Zhu^1 ; Yu, Jun^1 ; Zhu, Guangxin^1 ; Song, Debing^1 ; Xu, Mingjie^1
State Grid Electric Power Research Institute (SGEPRI), Nanjing, China^1
Keywords: Clustering accuracy; Distributed architecture; Improve performance; Mapreduce frameworks; Replacement policy; Sample pretreatment; Sampling efficiency; Slow convergences
Others  :  https://iopscience.iop.org/article/10.1088/1757-899X/452/4/042066/pdf
DOI  :  10.1088/1757-899X/452/4/042066
Subject classification: Materials science (general)
Source: IOP
【 Abstract 】
To address the high time complexity, slow convergence, low clustering accuracy, and slow running speed of the K-means algorithm, an efficient parallel K-means algorithm based on the Hadoop system and the MapReduce framework is proposed. First, the algorithm uses a K selective sorting algorithm to improve sampling efficiency; second, the iteration centers are updated using a weight replacement policy; finally, the initial center points are obtained using a sample pretreatment strategy. Experimental results show that the proposed algorithm achieves good convergence, accuracy, and speedup, and improves the overall performance of K-means clustering.
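For orientation, the following is a minimal single-process sketch of how one K-means iteration is commonly split into a map step (assign each point to its nearest center) and a reduce step (average the points per center). It illustrates plain K-means in a MapReduce style only. It does not implement the paper's K selective sorting, weight replacement policy, or sample pretreatment, which the abstract does not specify. All function names (`nearest_center`, `map_phase`, `reduce_phase`, `kmeans`) and the `max_iter` and `tol` parameters are hypothetical and chosen for illustration; no Hadoop API is used.

```python
from collections import defaultdict


def nearest_center(point, centers):
    """Return the index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )


def map_phase(points, centers):
    """Map step: emit (center_index, (point, 1)) for every input point."""
    for point in points:
        yield nearest_center(point, centers), (point, 1)


def reduce_phase(pairs, centers):
    """Reduce step: sum the points per center index, then divide by the count."""
    sums = {}
    counts = defaultdict(int)
    for idx, (point, n) in pairs:
        if idx in sums:
            sums[idx] = [s + p for s, p in zip(sums[idx], point)]
        else:
            sums[idx] = list(point)
        counts[idx] += n
    new_centers = []
    for i, old in enumerate(centers):
        if counts[i] == 0:
            # Assumption: a center with no assigned points is left unchanged.
            new_centers.append(tuple(old))
        else:
            new_centers.append(tuple(s / counts[i] for s in sums[i]))
    return new_centers


def kmeans(points, centers, max_iter=100, tol=1e-9):
    """Repeat map and reduce until no center moves by more than tol (squared distance)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        new_centers = reduce_phase(map_phase(points, centers), centers)
        shift = max(
            sum((a - b) ** 2 for a, b in zip(old, new))
            for old, new in zip(centers, new_centers)
        )
        centers = new_centers
        if shift <= tol:
            break
    return centers
```

In an actual Hadoop job, the map and reduce steps would run on separate nodes over partitions of the data, and the driver would rerun the job once per iteration with the updated centers; here both steps run in one process purely to show the data flow.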