Journal Article Details
IEEE Access
Adaptive Density-Based Spatial Clustering for Massive Data Analysis
Zihao Cai [1], Jian Wang [1], Kejing He [1]
[1] School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
Keywords: Clustering; density-based algorithms; linear connection; data block splitter; data block merger
DOI: 10.1109/ACCESS.2020.2969440
Source: DOAJ
【 Abstract 】

Clustering is a classical research field with broad applications in data mining, such as emotion detection, event extraction, and topic discovery. It aims to discover intrinsic patterns in a collection of data that can be grouped into clusters. Significant progress has been made by the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm and its variants. However, current density-based algorithms have a major limitation: they suffer from the linear connection problem, in which they fail to separate target clusters that are “connected” by a few data points. Moreover, parameter setting and time cost make these algorithms hard to adapt to massive data analysis. To address these problems, we propose a novel adaptive density-based spatial clustering algorithm called Ada-DBSCAN, which consists of a data block splitter and a data block merger, coordinated by local clustering and global clustering. We conduct extensive experiments on both artificial and real-world datasets to evaluate the effectiveness of Ada-DBSCAN. Experimental results show that our algorithm clearly outperforms several strong baselines in both clustering accuracy and human evaluation. In addition, Ada-DBSCAN is significantly more efficient than DBSCAN.
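
The linear connection problem described in the abstract can be reproduced on a toy dataset. The sketch below is a minimal, textbook-style DBSCAN written for illustration only; it is not Ada-DBSCAN and not the authors' code. The function name `dbscan`, the grid-shaped blobs, the three-point bridge, and the parameter values `eps=1.0` and `min_pts=3` are all assumptions chosen to make the effect visible.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dbscan(points, eps, min_pts):
    """Plain DBSCAN (illustrative). Returns one label per point, -1 = noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force eps-neighborhood; includes the point itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


# Two 3x3 grid-shaped blobs (assumed toy data), four units apart.
blob_a = [(x, y) for x in range(0, 3) for y in range(3)]
blob_b = [(x, y) for x in range(6, 9) for y in range(3)]
# Three points forming a thin line between the blobs.
bridge = [(3, 1), (4, 1), (5, 1)]

separate = dbscan(blob_a + blob_b, eps=1.0, min_pts=3)
bridged = dbscan(blob_a + bridge + blob_b, eps=1.0, min_pts=3)

print(len(set(separate)))  # 2 clusters when the blobs are isolated
print(len(set(bridged)))   # 1 cluster once the bridge is added
```

With the bridge in place, every bridge point has three points within `eps` (itself and its two neighbours on the line), so each one is a core point and density-reachability chains the two blobs into a single cluster. This is the failure mode the abstract attributes to DBSCAN and its variants.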

【 License 】

Unknown   
