Technical Report Details
Training Set Compression by Incremental Clustering
Li, Dalong ; Simske, Steven
HP Development Company
Keywords: Clustering; Support vector machine; KNN; Pattern recognition; CONDENSE
RP-ID: HPL-2011-25
Subject classification: Computer science (general)
United States | English
Source: HP Labs
[Abstract]
Training set compression reduces the size of a training set without degrading classification accuracy. A smaller training set makes training more efficient and also saves storage space. In this paper, an incremental clustering algorithm, the Leader algorithm, is used to reduce the size of a training set by effectively subsampling it. Experiments on several standard data sets, using SVM and KNN as classifiers, indicate that the proposed method is more efficient than CONDENSE at reducing the size of the training set without degrading classification accuracy. Whereas the compression ratio of the CONDENSE method is fixed, the proposed method offers a variable compression ratio controlled by the cluster threshold value.
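The abstract does not spell out the procedure, so the following is only a minimal sketch of the textbook single-pass Leader algorithm used as a subsampler, not the authors' implementation. Several choices are assumptions made for illustration: the function name `leader_subsample` is hypothetical, distance is Euclidean, clustering is done separately per class label, and a sample is absorbed by any existing leader within the threshold. Only the leaders are kept, so a larger threshold yields fewer retained samples and therefore a higher compression ratio.

```python
import math


def leader_subsample(points, labels, threshold):
    """Single-pass Leader clustering, applied per class label (assumption).

    Returns the retained samples (the cluster leaders) and their labels.
    """
    leaders = []        # retained samples
    leader_labels = []  # class label of each retained sample
    for x, y in zip(points, labels):
        # Check whether a same-class leader already lies within the threshold.
        covered = any(
            ly == y and math.dist(x, lx) <= threshold
            for lx, ly in zip(leaders, leader_labels)
        )
        if not covered:
            # No leader is close enough, so this sample starts a new cluster.
            leaders.append(x)
            leader_labels.append(y)
    return leaders, leader_labels


# Toy data (made up for illustration): two tight groups plus one stray sample.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.05, 0.05)]
labels = [0, 0, 1, 1, 1]

kept, kept_labels = leader_subsample(points, labels, threshold=0.5)
print(len(kept), "of", len(points), "samples kept")
```

The retained samples could then be passed to any classifier, such as an SVM or KNN, in place of the full training set. Because the algorithm makes a single pass, the result depends on the order in which samples are presented.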
Attachments
Files Size Format View
RO201804100002917LZ 105KB PDF download