Journal Article Details
Journal of Computer Science
CTSS: A Tool for Efficient Information Extraction with Soft Matching Rules for Text Mining
A. Christy, P. Thambidurai
Keywords: parsing; trigram model; soft matching; information extraction; recall; precision
DOI: 10.3844/jcssp.2008.375.381
Subject classification: Computer Science (General)
Source: Science Publications
【 Abstract 】

The abundance of digitally available information has created a demand for structured information. Text mining, the problem of discovering useful information in unstructured text, has attracted the attention of researchers. The role of Information Extraction (IE) software is to identify relevant information in texts, extracting it from a variety of sources and aggregating it into a single view. Existing IE systems depend on particular corpora and have poor recall. Developing a domain-independent system with improved recall is therefore an important challenge for IE. In this research, the authors proposed a domain-independent information extraction algorithm, called SOFTRULEMINING, for extracting the aim, methodology and conclusion from technical abstracts. The algorithm was implemented by combining a trigram model with soft matching rules. A tool, CTSS, was constructed using SOFTRULEMINING and tested on technical abstracts from www.computer.org and www.ansinet.org. The tool was found to improve recall, and in turn precision, in comparison with other search engines.
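
The abstract does not spell out the SOFTRULEMINING rules, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that a "soft matching rule" can be approximated by fuzzy comparison of word trigrams against cue phrases. The cue phrases, the 0.8 threshold, and the function names `trigrams`, `soft_match`, `label` and `precision_recall` are all hypothetical. The last function shows the standard set-based definitions of the two evaluation measures named in the abstract.

```python
from difflib import SequenceMatcher

# Hypothetical cue phrases, invented for illustration; they are not taken from the paper.
CUES = {
    "aim": ["the authors proposed", "this paper presents"],
    "methodology": ["was implemented by", "was constructed using"],
    "conclusion": ["found that the", "results show that"],
}

def trigrams(sentence):
    """Return the word trigrams of a sentence, lowercased."""
    words = sentence.lower().split()
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]

def soft_match(sentence, cue, threshold=0.8):
    """True if any word trigram of the sentence is approximately equal to the cue.

    Similarity is the difflib ratio; the threshold is an assumed value.
    """
    return any(
        SequenceMatcher(None, tri, cue).ratio() >= threshold
        for tri in trigrams(sentence)
    )

def label(sentence):
    """Return the first label whose cue soft-matches the sentence, else None."""
    for name, cues in CUES.items():
        if any(soft_match(sentence, cue) for cue in cues):
            return name
    return None

def precision_recall(extracted, relevant):
    """Standard definitions: precision = TP / |extracted|, recall = TP / |relevant|."""
    extracted, relevant = set(extracted), set(relevant)
    true_positives = len(extracted & relevant)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall
```

Under these assumptions, a sentence containing "the authors propose" would still be labelled as an aim even though the cue is "the authors proposed". Tolerance of this kind is how soft matching can raise recall over exact-match rules.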

【 License 】

Unknown   
