Journal Article Details
Genomics & Informatics
Using the PubAnnotation ecosystem to perform agile text mining on Genomics & Informatics: a tutorial review
Hee-Jo Nam [1], Hyun-Seok Park [1], Ryota Yamada [2]
[1] Bioinformatics Laboratory, ELTEC College of Engineering, Ewha Womans University, Seoul 03760, Korea; [2] Fuku Corporation, Tokyo 113-0033, Japan
Keywords: named entity recognition; natural language processing; text mining
DOI: 10.5808/GI.2020.18.2.e13
Source: DOAJ
【 Abstract 】

The prototype version of the full-text corpus of Genomics & Informatics has recently been archived in a GitHub repository. The full-text publications of volumes 10 through 17 are also directly downloadable from PubMed Central (PMC) as XML files. During the Biomedical Linked Annotation Hackathon 6 (BLAH6), we experimented with converting, annotating, and updating 301 PMC full-text articles of Genomics & Informatics using PubAnnotation, a system that provides a convenient way to add PMC publications based on PMCID. Thus, this review aims to provide a tutorial overview of practicing the iterative task of named entity recognition with the PubAnnotation/PubDictionaries/TextAE ecosystem. We also describe developing a conversion tool between the Genia tagger output and the JSON format of PubAnnotation during the hackathon.
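
The conversion step mentioned at the end of the abstract can be illustrated with a minimal sketch. This is not the tool developed at the hackathon. It assumes the Genia tagger's usual tab-separated output (word, base form, part-of-speech tag, chunk tag, named-entity tag in IOB2 notation, one token per line, blank line between sentences) and the PubAnnotation JSON layout with a "text" field and a "denotations" list of character spans. The function name, the sample sentence, and the sample tags are hypothetical, and the token alignment assumes every token appears verbatim in the source text, which may not hold when the tagger normalizes characters such as quotes.

```python
import json


def genia_to_pubannotation(text, tagger_output):
    """Turn Genia-style tagged lines into a PubAnnotation-style dict (illustrative sketch)."""
    spans = []        # collected entities as [begin, end, label]
    open_span = None  # entity currently being extended, if any
    cursor = 0        # position in `text` up to which tokens have been aligned

    for line in tagger_output.splitlines():
        fields = line.split("\t")
        if len(fields) != 5:
            # Blank or malformed line: treat it as a sentence boundary.
            open_span = None
            continue
        word, ne_tag = fields[0], fields[4]

        # Align the token with the original text to recover character offsets.
        begin = text.find(word, cursor)
        if begin < 0:
            raise ValueError(f"token {word!r} not found in text after offset {cursor}")
        end = begin + len(word)
        cursor = end

        label = ne_tag[2:] if ne_tag[:2] in ("B-", "I-") else None
        if label is None:
            open_span = None                 # "O" tag closes any open entity
        elif ne_tag.startswith("I-") and open_span is not None and open_span[2] == label:
            open_span[1] = end               # continue the current entity
        else:
            open_span = [begin, end, label]  # start a new entity
            spans.append(open_span)

    denotations = [
        {"id": f"T{i}", "span": {"begin": b, "end": e}, "obj": label}
        for i, (b, e, label) in enumerate(spans, start=1)
    ]
    return {"text": text, "denotations": denotations}


# Hypothetical sample input, written by hand for illustration only.
SAMPLE_TEXT = "IL-2 gene expression requires NF-kappa B."
SAMPLE_TAGGED = "\n".join([
    "IL-2\tIL-2\tNN\tB-NP\tB-DNA",
    "gene\tgene\tNN\tI-NP\tI-DNA",
    "expression\texpression\tNN\tI-NP\tO",
    "requires\trequire\tVBZ\tB-VP\tO",
    "NF-kappa\tNF-kappa\tNN\tB-NP\tB-protein",
    "B\tB\tNN\tI-NP\tI-protein",
    ".\t.\t.\tO\tO",
])

if __name__ == "__main__":
    print(json.dumps(genia_to_pubannotation(SAMPLE_TEXT, SAMPLE_TAGGED), indent=2))
```

Character offsets are recovered by searching forward from the previous token, so repeated words map to successive occurrences rather than always to the first one.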

【 License 】

Unknown   
