Journal article details
Australasian Journal of Information Systems
Comparing sets of patterns with the Jaccard index
Sam Fletcher [1], Md Zahidul Islam [1]
[1] Charles Sturt University
Keywords: Machine Learning; Metrics; Data Mining; Patterns; Rules; Utility Measures; Quality Evaluation
DOI: 10.3127/ajis.v22i0.1538
Subject classification: Computer Science (General)
Source: University of Canberra * Faculty of Information Sciences and Engineering
【 Abstract 】

The ability to extract knowledge from data has been the driving force of Data Mining since its inception, and of statistical modeling long before even that. Actionable knowledge often takes the form of patterns, where a set of antecedents can be used to infer a consequent. In this paper we offer a solution to the problem of comparing different sets of patterns. Our solution allows comparisons between sets of patterns that were derived from different techniques (such as different classification algorithms), or made from different samples of data (such as temporal data or data perturbed for privacy reasons). We propose using the Jaccard index to measure the similarity between sets of patterns by converting each pattern into a single element within the set. Our measure focuses on providing conceptual simplicity, computational simplicity, interpretability, and wide applicability. The results of this measure are compared to prediction accuracy in the context of a real-world data mining scenario.
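
The core idea in the abstract, treating each pattern as one element of a set and comparing two such sets with the Jaccard index J(A, B) = |A ∩ B| / |A ∪ B|, can be illustrated with a minimal Python sketch. The encoding of a pattern as a pair (frozenset of antecedents, consequent), the helper names `make_pattern` and `jaccard_index`, the toy rules, and the convention that two empty sets have similarity 1.0 are all assumptions made for illustration; the abstract does not specify how the paper converts patterns into elements.

```python
from typing import FrozenSet, Hashable, Iterable, Tuple

# Hypothetical encoding: a pattern is (set of antecedents, consequent).
Pattern = Tuple[FrozenSet[Hashable], Hashable]


def make_pattern(antecedents: Iterable[Hashable], consequent: Hashable) -> Pattern:
    """Turn a pattern into a single hashable element (order of antecedents ignored)."""
    return (frozenset(antecedents), consequent)


def jaccard_index(a: Iterable[Pattern], b: Iterable[Pattern]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets of patterns."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumed convention: two empty pattern sets are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


# Toy pattern sets, e.g. rules produced by two different classifiers.
rules_a = {
    make_pattern(["age>30", "income=high"], "buys=yes"),
    make_pattern(["age<=30"], "buys=no"),
    make_pattern(["student=yes"], "buys=yes"),
}
rules_b = {
    make_pattern(["income=high", "age>30"], "buys=yes"),  # same rule, different order
    make_pattern(["age<=30"], "buys=no"),
    make_pattern(["student=no"], "buys=no"),
}

similarity = jaccard_index(rules_a, rules_b)  # 2 shared patterns out of 4 distinct
```

Under this encoding two patterns count as shared only if they match exactly, so the result lies between 0 (no patterns in common) and 1 (identical sets); any looser notion of matching would need a different conversion step.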

【 License 】

CC BY   
