| BMC Bioinformatics | |
| Exploiting sequence labeling framework to extract document-level relations from biomedical texts | |
| Yang Xiang1, Zhiheng Li2, Yuanyuan Sun2, Ling Luo2, Hongfei Lin2, Zhihao Yang2 | |
| [1] School of Biomedical Informatics, University of Texas Health Science Center at Houston; [2] School of Computer Science and Technology, Dalian University of Technology | |
| Keywords: Relation extraction; Document-level relation; Sequence labeling | |
| DOI: 10.1186/s12859-020-3457-2 | |
| Source: DOAJ | |
【 Abstract 】
Background: Both intra- and inter-sentential semantic relations in biomedical texts provide valuable information for biomedical research. However, most existing methods either extract only intra-sentential relations and ignore inter-sentential ones, or extract inter-sentential relations inaccurately. They also treat the instances containing entity relations as independent, which neglects the interactions between relations. We propose a novel sequence labeling-based biomedical relation extraction method named Bio-Seq. In this method, the sequence labeling framework is extended with multiple specialized feature extractors to facilitate feature extraction at different levels, especially at the inter-sentential level. In addition, the sequence labeling framework enables Bio-Seq to take advantage of the interactions between relations, which further improves the precision of document-level relation extraction.

Results: Our proposed method obtained an F1-score of 63.5% on the BioCreative V chemical disease relation corpus and an F1-score of 54.4% on inter-sentential relations, which was 10.5% better than the document-level classification baseline. Our method also achieved an F1-score of 85.1% on the n2c2-ADE sub-dataset.

Conclusion: Sequence labeling can be used successfully to extract document-level relations, and it is especially effective at boosting performance on inter-sentential relation extraction. Our work can facilitate research on document-level biomedical text mining.
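
The general idea of casting relation extraction as sequence labeling can be illustrated with a minimal sketch. This is not the Bio-Seq implementation: the abstract does not specify the tagging scheme, so the sketch assumes a hypothetical one in which one entity is fixed as the source and every token in the document receives a BIO-style tag marking whether it belongs to a target entity related to that source. The function name `label_document`, the tag names, the entity identifiers, and the toy two-sentence document are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical span format: (entity_id, start_token, end_token_exclusive)
Span = Tuple[str, int, int]


def label_document(
    tokens: List[str],
    entities: List[Span],
    source_id: str,
    related_pairs: Set[Tuple[str, str]],
    relation: str = "CID",
) -> List[str]:
    """Tag every token relative to one source entity (assumed scheme).

    Tokens of a target entity related to the source get B-/I- tags;
    all other tokens get "O". Because the whole document is one
    sequence, targets in other sentences are labeled in the same pass.
    """
    labels = ["O"] * len(tokens)
    for ent_id, start, end in entities:
        if ent_id == source_id:
            continue  # the source entity itself is not a target
        if (source_id, ent_id) in related_pairs:
            labels[start] = f"B-{relation}"
            for i in range(start + 1, end):
                labels[i] = f"I-{relation}"
    return labels


# Toy document: the chemical and the disease sit in different sentences.
tokens = "Aspirin was given . Later the patient developed gastric bleeding .".split()
entities = [("C1", 0, 1), ("D1", 8, 10)]  # chemical span, disease span
related = {("C1", "D1")}                  # toy gold relation

labels = label_document(tokens, entities, source_id="C1", related_pairs=related)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Under this assumed scheme, all targets of one source entity are predicted jointly in a single label sequence, which is one way a tagger could exploit interactions between relations rather than classifying each entity pair independently.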
【 License 】
Unknown