BMC Bioinformatics
The biomedical discourse relation bank
Research Article
Susan McRoy, Nadya Frid, Hong Yu, Rashmi Prasad, Aravind Joshi
Affiliations: Department of Electrical Engineering and Computer Science, University of Wisconsin-Milwaukee, P.O. Box 784, 53201, Milwaukee, WI, USA; Department of Health Sciences, University of Wisconsin-Milwaukee, P.O. Box 413, 53201, Milwaukee, WI, USA; Institute for Research in Cognitive Science, University of Pennsylvania, 3401 Walnut Street, 19104, Philadelphia, PA, USA; Department of Computer and Information Science, University of Pennsylvania, 3330 Walnut Street, 19104, Philadelphia, PA, USA
Keywords: Discourse Relation; Biomedical Literature; Sense Type; Biomedical Domain; Biomedical Text
DOI: 10.1186/1471-2105-12-188
Received 2010-10-14; accepted 2011-05-23; published 2011
Source: Springer
【 Abstract 】

Background: Identification of discourse relations, such as causal and contrastive relations, between situations mentioned in text is an important task for biomedical text-mining. A biomedical text corpus annotated with discourse relations would be very useful for developing and evaluating methods for biomedical discourse processing. However, little effort has been made to develop such an annotated resource.

Results: We have developed the Biomedical Discourse Relation Bank (BioDRB), in which we have annotated explicit and implicit discourse relations in 24 open-access full-text biomedical articles from the GENIA corpus. Guidelines for the annotation were adapted from the Penn Discourse TreeBank (PDTB), which has discourse relations annotated over open-domain news articles. We introduced new conventions and modifications to the sense classification. We report reliable inter-annotator agreement of over 80% for all sub-tasks. Experiments for identifying the sense of explicit discourse connectives show the connective itself as a highly reliable indicator for coarse sense classification (accuracy 90.9% and F1 score 0.89). These results are comparable to results obtained with the same classifier on the PDTB data. With more refined sense classification, there is degradation in performance (accuracy 69.2% and F1 score 0.28), mainly due to sparsity in the data. The size of the corpus was found to be sufficient for identifying the sense of explicit connectives, with classifier performance stabilizing at about 1900 training instances. Finally, the classifier performs poorly when trained on PDTB and tested on BioDRB (accuracy 54.5% and F1 score 0.57).

Conclusion: Our work shows that discourse relations can be reliably annotated in biomedical text. Coarse sense disambiguation of explicit connectives can be done with high reliability by using just the connective as a feature, but more refined sense classification requires either richer features or more annotated data. The poor performance of a classifier trained in the open domain and tested in the biomedical domain suggests significant differences in the semantic usage of connectives across these domains, and provides robust evidence for a biomedical sublanguage for discourse and the need to develop a specialized biomedical discourse annotated corpus. The results of our cross-domain experiments are consistent with related work on identifying connectives in BioDRB.

【 License 】

© Prasad et al; licensee BioMed Central Ltd. 2011. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
