Technical Report Details
Adventures in Feature Selection on an Industrial Dataset... and Ensuing
Forman, George
HP Development Company
Keywords: text feature selection; text classification; document categorization; lessons learned
RP-ID: HPL-2012-161R1
Subject classification: Computer Science (General)
United States | English
Source: HP Labs
【 Abstract 】

We relate the story of an interesting failure of text feature selection methods on an industrial dataset of technical documents. Our detailed dissection and ultimate understanding of the failure led to general solutions that not only solved the robustness problem we faced but also improved classification accuracy on simpler, public datasets, which was crucial to making the work publishable.
