Technical Report

[Abstract]

This paper promotes a new task for supervised machine learning research: quantification, the pursuit of learning methods for accurately estimating the class distribution of a test set, with no concern for predictions on individual cases. A variant for cost quantification addresses the need to total up costs according to categories predicted by imperfect classifiers. These tasks cover a large and important family of applications that measure trends over time. The paper establishes a research methodology and uses it to evaluate several proposed methods that involve selecting the classification threshold in a way that would spoil the accuracy of individual classifications. In empirical tests, Median Sweep methods show outstanding ability to estimate the class distribution, despite wide disparity in testing and training conditions. The paper addresses shifting class priors and costs, but not concept drift in general.

Quantifying Trends Accurately Despite Classifier Error and Class Imbalance

Forman, George
HP Development Company
Keywords: classification; quantification; cost quantification; text mining
RP-ID : HPL-2006-48R1
Subject classification: Computer Science (General)
United States | English
Source: HP Labs