Journal article details
BMC Bioinformatics
NOBLE – Flexible concept recognition for large-scale biomedical natural language processing
Software
Elizabeth Legowski [1], Julia Corrigan [1], Eugene Tseytlin [1], Kevin Mitchell [1], Rebecca S. Jacobson [1], Girish Chavan [1]
[1] Department of Biomedical Informatics, University of Pittsburgh School of Medicine, The Offices at Baum, 5607 Baum Boulevard, BAUM 423, Rm 523, Pittsburgh, PA 15206-3701, USA
Keywords: Natural language processing; Text-processing; Named Entity Recognition; Concept recognition; Biomedical terminologies; Auto-coding; System evaluation
DOI: 10.1186/s12859-015-0871-y
Received 2015-07-31, accepted 2015-12-22, published 2016
Source: Springer
【 Abstract 】

Background: Natural language processing (NLP) applications are increasingly important in biomedical data analysis, knowledge engineering, and decision support. Concept recognition is an important component task for NLP pipelines, and can be either general-purpose or domain-specific. We describe a novel, flexible, and general-purpose concept recognition component for NLP pipelines, and compare its speed and accuracy against five commonly used alternatives on both a biological and a clinical corpus.

Methods: NOBLE Coder implements a general algorithm for matching terms to concepts from an arbitrary vocabulary set. The system's matching options can be configured individually or in combination to yield specific system behavior for a variety of NLP tasks. The software is open source, freely available, and easily integrated into UIMA or GATE. We benchmarked speed and accuracy of the system against the CRAFT and ShARe corpora as reference standards and compared it to MMTx, MGrep, Concept Mapper, cTAKES Dictionary Lookup Annotator, and cTAKES Fast Dictionary Lookup Annotator.

Results: We describe key advantages of the NOBLE Coder system and associated tools, including its greedy algorithm, configurable matching strategies, and multiple terminology input formats. These features provide unique functionality when compared with existing alternatives, including state-of-the-art systems. On two benchmarking tasks, NOBLE's performance exceeded commonly used alternatives, performing almost as well as the most advanced systems. Error analysis revealed differences in error profiles among systems.

Conclusion: NOBLE Coder is comparable to other widely used concept recognition systems in terms of accuracy and speed. Advantages of NOBLE Coder include its interactive terminology builder tool, ease of configuration, and adaptability to various domains and tasks. NOBLE provides a term-to-concept matching system suitable for general concept recognition in biomedical NLP pipelines.

【 License 】

CC BY   
© Tseytlin et al. 2016
