Journal Article Details
Applied Sciences
Efficient High-Dimensional Kernel k-Means++ with Random Projection
Jan Y. K. Chan [1], Alex Po Leung [1], Yunbo Xie [1]
[1] Faculty of Information Technology, Macau University of Science and Technology, Taipa 999078, China
Keywords: kernel k-means; k-means++; random projection; dimensionality reduction; high-dimensional data
DOI: 10.3390/app11156963
Source: DOAJ
【 Abstract 】

We propose a method that uses random projection to speed up both kernel k-means and centroid initialization with k-means++. Motivated by upper error bounds, we approximate the kernel matrix and distances in a lower-dimensional space R^d before running kernel k-means clustering. For random projections, we consider previous work on bounds for dot products, together with an improved bound for kernel methods, in the context of kernel k-means. The complexities of kernel k-means with Lloyd's algorithm and of centroid initialization with k-means++ are known to be O(nkD) and Θ(nkD), respectively, where n is the number of data points, D is the dimensionality of the input feature vectors, and k is the number of clusters. The proposed method reduces the computational complexity of the kernel computation in kernel k-means from O(n^2 D) to O(n^2 d), and that of the subsequent k-means computation with Lloyd's algorithm and of centroid initialization from O(nkD) to O(nkd). In our experiments, with reduced dimensionality d = 200 the clustering method runs 2 to 26 times faster, generally with very little performance degradation (less than one percent).
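The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian projection matrix, the RBF kernel, the value of `gamma`, the synthetic data, and all function names are assumptions chosen for illustration. It projects the data from R^D to R^d, builds the kernel matrix from the projected points (O(n^2 d) instead of O(n^2 D)), and runs k-means++ seeding on the projected points (O(nkd) instead of O(nkD)).

```python
import numpy as np

def random_projection(X, d, rng):
    """Map rows of X from R^D to R^d with a Gaussian matrix scaled by 1/sqrt(d)."""
    D = X.shape[1]
    R = rng.standard_normal((D, d)) / np.sqrt(d)
    return X @ R

def rbf_kernel(Z, gamma):
    """RBF kernel matrix from pairwise squared distances (assumed kernel choice)."""
    sq = np.sum(Z * Z, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T), 0.0)
    return np.exp(-gamma * dist2)

def kmeanspp_seeds(Z, k, rng):
    """k-means++ seeding: sample each new centre with probability
    proportional to its squared distance to the nearest chosen centre."""
    n = Z.shape[0]
    seeds = [int(rng.integers(n))]
    d2 = np.sum((Z - Z[seeds[0]]) ** 2, axis=1)
    for _ in range(1, k):
        idx = int(rng.choice(n, p=d2 / d2.sum()))
        seeds.append(idx)
        d2 = np.minimum(d2, np.sum((Z - Z[idx]) ** 2, axis=1))
    return seeds

# Synthetic data (assumption): k well-separated Gaussian clusters in R^D.
rng = np.random.default_rng(0)
n, D, d, k = 150, 2000, 200, 3
centres = rng.standard_normal((k, D))
X = centres[rng.integers(k, size=n)] + 0.1 * rng.standard_normal((n, D))

gamma = 1.0 / D                     # illustrative bandwidth, not from the paper
Z = random_projection(X, d, rng)    # n x d projected data
K_full = rbf_kernel(X, gamma)       # kernel in the original space, O(n^2 D)
K_proj = rbf_kernel(Z, gamma)       # kernel in the projected space, O(n^2 d)
seeds = kmeanspp_seeds(Z, k, rng)   # seeding in the projected space, O(nkd)

print("mean |K_full - K_proj| =", np.mean(np.abs(K_full - K_proj)))
print("seed indices:", seeds)
```

The printed kernel discrepancy gives a rough sense of how well the projected kernel matrix tracks the original one on this toy data. The error bounds and the reported speed-ups are those of the paper and are not reproduced by this sketch.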

【 License 】

Unknown   
