BMC Bioinformatics
The gene normalization task in BioCreative III
Research
Patrick Ruch1  Dina Vishnyakova2  Kevin M Livingston3  Karin Verspoor3  Sergio Matos4  David Campos4  Richard Tzong-Han Tsai5  Hung-Yu Kao6  Chih-Hsuan Wei6  Jingchen Liu7  Minlie Huang7  Hong-Jie Dai8  Sanmitra Bhattacharya9  Padmini Srinivasan9  Hongfang Liu1,10  Illes Solt1,11  Martin Gerner1,12  Han-Cheol Cho1,13  Fabio Rinaldi1,14  Cheng-Ju Kuo1,15  Chun-Nan Hsu1,16  Naoaki Okazaki1,17  Manabu Torii1,18  Feifan Liu1,19  Shashank Agarwal1,19  Martin Romacker2,20  W John Wilbur2,21  Zhiyong Lu2,21
[1] BiTeM Group, Information Science Department, University of Applied Science, Geneva, Switzerland;BiTeM Group, Division of Medical Information Sciences, University of Geneva, Switzerland;Center for Computational Pharmacology, University of Colorado School of Medicine, Aurora, Colorado, USA;DETI/IEETA, University of Aveiro, Campus Universitário de Santiago, 3810-193, Aveiro, Portugal;Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan, R.O.C;Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan, R.O.C;Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China;Department of Computer Science, National Tsing-Hua University, Hsinchu, Taiwan, R.O.C;Institute of Information Science, Academia Sinica, Taipei, Taiwan, R.O.C;Department of Computer Science, The University of Iowa, 52242, Iowa City, Iowa, USA;Department of Health Sciences Research, Mayo Clinic College of Medicine, MN55905, Rochester, USA;Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, 1117, Budapest, Hungary;Faculty of Life Sciences, University of Manchester, M13 9PT, Manchester, UK;Graduate School of Information Science and Technology, University of Tokyo, Japan;Institute of Computational Linguistics, University of Zurich, Zurich, Switzerland;Institute of Information Science, Academia Sinica, 115, Taipei, Taiwan;Institute of Information Science, Academia Sinica, 115, Taipei, Taiwan;Information Science Institute, University of Southern California, Marina del Rey, California, USA;Interfaculty Initiative in Information Studies, University of Tokyo, Japan;Lab of Text Intelligence in Biomedicine, Georgetown University Medical Center, 4000 Reservoir Rd., NW, 20057, Washington, DC, USA;Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA;NITAS/TMS, Text Mining Services, Novartis, AG, Switzerland;National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, 20894, Bethesda, Maryland, USA
Keywords: Confidence Score; Expectation Maximization Algorithm; Name Entity Recognition; Gene Mention; Human Annotation
DOI: 10.1186/1471-2105-12-S8-S2
Source: Springer
【 Abstract 】
Background: We report the Gene Normalization (GN) challenge in BioCreative III, where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k).

Results: We received a total of 37 runs from 14 different teams for the task. When evaluated using the gold-standard annotations of the 50 articles, the highest TAP-k scores were 0.3297 (k=5), 0.3538 (k=10), and 0.3535 (k=20), respectively. Higher TAP-k scores of 0.4916 (k=5, 10, 20) were observed when evaluated using the inferred ground truth over the full test set. When combining team results using machine learning, the best composite system achieved TAP-k scores of 0.3707 (k=5), 0.4311 (k=10), and 0.4477 (k=20) on the gold standard, representing improvements of 12.4%, 21.8%, and 26.6% over the best team results, respectively.

Conclusions: By using full text and being species non-specific, the GN task in BioCreative III has moved closer to a real literature curation task than similar tasks in the past and presents additional challenges for the text mining community, as revealed in the overall team results. By evaluating teams using the gold standard, we show that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible. Using the inferred ground truth, we show measures of comparative performance between teams. Finally, by comparing team rankings on gold standard vs. inferred ground truth, we further demonstrate that the inferred ground truth is as effective as the gold standard for detecting good team performance.
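The improvement percentages quoted in the Results can be checked directly from the reported TAP-k scores. The short Python sketch below does that arithmetic; it assumes the percentages are relative gains over the best single-team score at each k, and the names `best_team`, `composite`, and `relative_improvement` are illustrative, not taken from the paper.

```python
# TAP-k scores on the gold standard, as reported in the abstract.
best_team = {5: 0.3297, 10: 0.3538, 20: 0.3535}   # best individual team run
composite = {5: 0.3707, 10: 0.4311, 20: 0.4477}   # best composite system


def relative_improvement(baseline: float, improved: float) -> float:
    """Relative gain of `improved` over `baseline`, in percent (assumed definition)."""
    return (improved - baseline) / baseline * 100.0


# Round to one decimal place, matching the precision used in the abstract.
improvements = {
    k: round(relative_improvement(best_team[k], composite[k]), 1)
    for k in best_team
}

for k, pct in improvements.items():
    print(f"k={k}: {pct}%")
```

Under that assumption, the computed values agree with the 12.4%, 21.8%, and 26.6% stated above.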
【 License 】
CC BY
© Lu et al; licensee BioMed Central Ltd. 2011