Journal article details
BMC Bioinformatics
Event extraction of bacteria biotopes: a knowledge-intensive NLP-based approach
Proceedings
Wiktoria Golik [1], Pierre Warnier [2], Zorana Ratkovic [3]
[1] MIG INRA UR1077 Domaine de Vilvert, F-78352, Jouy-en-Josas, France; MIG INRA UR1077 Domaine de Vilvert, F-78352, Jouy-en-Josas, France; LIG - Université Joseph Fourier, BP 53, 385, rue de la Bibliothèque, F-38400, Saint-Martin-d'Hères, France; MIG INRA UR1077 Domaine de Vilvert, F-78352, Jouy-en-Josas, France; LaTTiCe UMR 8094 CNRS Université Paris 3, 1 rue Maurice Arnoux, F-92120, Montrouge, France
Keywords: Training Corpus; Location Term; Candidate Term; Event Extraction; Lexical Resource
DOI: 10.1186/1471-2105-13-S11-S8
Source: Springer
【 Abstract 】

Background: Bacteria biotopes cover a wide range of diverse habitats, including animal and plant hosts and natural, medical and industrial environments. The high volume of publications in the microbiology domain provides a rich source of up-to-date information on bacteria biotopes. This information, as found in scientific articles, is expressed in natural language and is rarely available in a structured format such as a database. It is of great importance for fundamental research and microbiology applications (e.g., medicine, agronomy, food, bioenergy), so its automatic extraction from texts will provide a great benefit to the field.

Methods: We present a new method for extracting relationships between bacteria and their locations using the Alvis framework. Recognition of bacteria and their locations was achieved using a pattern-based approach and domain lexical resources. For the detection of environment locations, we propose a new approach that combines lexical information and the syntactic-semantic analysis of corpus terms to overcome the incompleteness of lexical resources. Bacteria location relations extend across sentence boundaries, so we developed domain-specific rules for dealing with bacteria anaphors.

Results: We participated in the BioNLP 2011 Bacteria Biotope (BB) task with the Alvis system. Official evaluation results show that it achieves the best performance among the participating systems. New developments since then have increased the F-score by 4.1 points.

Conclusions: We have shown that the combination of semantic analysis and domain-adapted resources is both effective and efficient for event information extraction in the bacteria biotope domain. We plan to adapt the method to deal with a larger set of location types and a large-scale scientific article corpus, to enable microbiologists to integrate and use the extracted knowledge in combination with experimental data.
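
The method summarized above (lexicon-driven recognition of bacteria and locations, plus rules that carry a bacterium across sentence boundaries through anaphors) can be illustrated with a toy sketch. This is not the Alvis system or its API: the lexicons, the anaphor list, and the function names `find_terms` and `extract_localizations` are hypothetical, and the anaphora rule is a deliberately naive "most recently mentioned bacterium" heuristic.

```python
import re

# Hypothetical toy lexicons; the paper relies on much larger domain resources.
BACTERIA = {"Bacillus subtilis", "Escherichia coli"}
LOCATIONS = {"soil", "milk", "human gut"}
# Hypothetical anaphoric expressions that may refer back to a bacterium.
ANAPHORS = {"the bacterium", "this organism"}


def find_terms(sentence, lexicon):
    """Return lexicon entries found in the sentence, in order of appearance."""
    hits = []
    for term in lexicon:
        match = re.search(r"\b" + re.escape(term) + r"\b", sentence, re.IGNORECASE)
        if match:
            hits.append((match.start(), term))
    return [term for _, term in sorted(hits)]


def extract_localizations(text):
    """Pair each bacterium with the locations mentioned in the same sentence.

    If a sentence names no bacterium but contains an anaphor, the most
    recently mentioned bacterium is used instead (a naive assumption).
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    last_bacterium = None
    relations = []
    for sentence in sentences:
        bacteria = find_terms(sentence, BACTERIA)
        if bacteria:
            last_bacterium = bacteria[-1]
        elif last_bacterium is not None and find_terms(sentence, ANAPHORS):
            bacteria = [last_bacterium]
        for bacterium in bacteria:
            for location in find_terms(sentence, LOCATIONS):
                relations.append((bacterium, location))
    return relations


if __name__ == "__main__":
    sample = (
        "Bacillus subtilis is commonly found in soil. "
        "The bacterium was also isolated from milk."
    )
    for bacterium, location in extract_localizations(sample):
        print(f"{bacterium} -> {location}")
```

In the sample text, the second sentence names no bacterium, so the anaphor rule is what links "milk" to Bacillus subtilis. Exact lexicon matching of this kind misses locations absent from the lexicon, which is the incompleteness the abstract says its syntactic-semantic term analysis is meant to address.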

【 License 】

CC BY   
© Ratkovic et al.; licensee BioMed Central Ltd. 2012
