Technical Report Details
New Frontiers For An Artificial Immune System
Greensmith, Julie
HP Development Company
Keywords: artificial immune system; document classification; feature vectors; AIRS
RP-ID: HPL-2003-204
Subject classification: Computer Science (General)
United States | English
Source: HP Labs
【 Abstract 】

AIRS, a resource-limited artificial immune classifier system, has performed well on various classification tasks, including data clustering. This thesis proposes the use of this system for the complex task of multi-class document classification. Initially, the AIRS system is validated using a standard machine learning dataset that has not previously been used with this classifier. The use of AIRS for document classification is then examined. This includes the pre-processing of HTML documents and the extraction, selection and representation of features for the purpose of feature vector compilation. AIRS was used to classify various Internet documents, using a variety of datasets. Comparisons were made in which the number of documents, the number of classes and the number of features were varied independently. Additionally, AIRS was compared with another text classification package as a benchmarking exercise. On completion of this work, we are confident that AIRS is a suitable candidate for increasingly complex tasks such as hierarchical document classification and multiple taxonomic mappings. 71 pages.
