Conference Paper Details
2nd Annual International Conference on Information System and Artificial Intelligence
A pruning algorithm for Meta-blocking based on cumulative weight
Physics; Computer Science
Zhang, Fulin^1 ; Gao, Zhipeng^1 ; Niu, Kun^2
State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China^1
School of Software Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China^2
Keywords: Blocking method; Blocking process; Cumulative weight; Entity resolutions; Heterogeneous information; Large datasets; Pruning algorithms; Quadratic complexity
Others  :  https://iopscience.iop.org/article/10.1088/1742-6596/887/1/012058/pdf
DOI  :  10.1088/1742-6596/887/1/012058
Subject classification: Computer Science (General)
Source: IOP
【 Abstract 】

Entity Resolution is an important process in data cleaning and data integration. It usually employs a blocking method to avoid quadratic complexity when scaling to large data sets. Meta-blocking can perform better in the context of highly heterogeneous information spaces, yet its precision and efficiency still have room to improve. In this paper, we present a new pruning algorithm for Meta-blocking. It achieves higher precision than the existing WEP algorithm at a small cost in recall. In addition, it reduces the runtime of the blocking process. We evaluate our proposed method over five real-world data sets.
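
For context, the WEP (Weighted Edge Pruning) baseline named in the abstract is commonly described as follows: build a blocking graph whose edges connect entities that share at least one block, weight each edge, and discard edges whose weight falls below the mean. The sketch below illustrates that baseline only, not the pruning algorithm proposed in the paper, which the abstract does not specify. It assumes the common-blocks weighting scheme (edge weight = number of shared blocks); the function name `weighted_edge_pruning` and the dictionary input format are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def weighted_edge_pruning(blocks):
    """Return the entity pairs kept by a WEP-style pruning of `blocks`.

    `blocks` maps a block key to the entity ids it contains
    (hypothetical input format chosen for this illustration).
    """
    # Blocking graph: one edge per co-occurring pair, weighted by the
    # number of blocks the two entities share (assumed weighting scheme).
    weights = defaultdict(int)
    for entities in blocks.values():
        for a, b in combinations(sorted(set(entities)), 2):
            weights[(a, b)] += 1

    if not weights:
        return {}

    # Keep only the edges whose weight reaches the mean edge weight.
    threshold = sum(weights.values()) / len(weights)
    return {pair: w for pair, w in weights.items() if w >= threshold}


blocks = {
    "smith": ["e1", "e2", "e3"],
    "john": ["e1", "e2"],
    "london": ["e1", "e2", "e4"],
    "paris": ["e3", "e4"],
}
kept = weighted_edge_pruning(blocks)
print(kept)  # {('e1', 'e2'): 3}
```

In this toy input, six candidate pairs have a mean weight of 8/6, so only the pair sharing three blocks survives. This is the trade-off the abstract refers to: fewer comparisons and higher precision, with some risk of losing true matches.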
