Journal Article Details
BMC Bioinformatics
BioCreative III interactive task: an overview
Research
David Salgado1  Pascale Gaudet2  Cathy H Wu3  Cecilia N Arighi3  Sanmitra Bhattacharya4  Padmini Srinivasan4  Naoaki Okazaki5  Rune Sætre6  Philippe E Thomas7  Ulf Leser7  Lynette Hirschman8  Fabio Rinaldi9  Simon Clematide9  Lois J Maltais1,10  Feifan Liu1,11  Shashank Agarwal1,11  Luca Toldo1,12  Zhiyong Lu1,13  Phoebe M Roberts1,14  Ian Harrow1,14  Martin Krallinger1,15  Eva Huala1,16  Donghui Li1,16  Michelle Gwinn Giglio1,17  Livia Perfetto1,18  Gianni Cesareni1,19  Andrew Chatr-aryamontri2,20 
[1] Australian Regenerative Medicine Institute, Monash University, Melbourne, Victoria, Australia;Developmental Biology Institute of Marseille Luminy (IBDML), Université de la Méditerranée, Campus de Luminy, Marseille, France;CALIPHO group, Swiss Institutes of Bioinformatics, Geneva, Switzerland;dictyBase, NIBIC, Northwestern University, Chicago, IL, USA;Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA;Department of Computer Science, The University of Iowa, Iowa City, Iowa, USA;Department of Computer Science, University of Tokyo, Japan;Department of Computer Science, University of Tokyo, Japan;Department of Computer and Information Science, NTNU, Trondheim, Norway;Humboldt-Universität zu Berlin, Unter den Linden 6, 10099, Berlin, Germany;Information Technology Center, The MITRE Corporation, MA, Bedford, USA;Institute of Computational Linguistics, University of Zurich, Zurich, Switzerland;MGI, The Jackson Laboratory, Bar Harbor, ME, USA;Medical Informatics, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA;Merck KGaA, Darmstadt, Germany;National Center for Biotechnology Information (NCBI), Bethesda, MD, USA;Pfizer Research Technology Center, Cambridge, Massachusetts, USA;Structural and Computational Biology Group, Spanish National Cancer Research Centre (CNIO), Madrid, Spain;TAIR, Carnegie Institution for Science, Washington, DC, USA;University of Maryland, Baltimore, MD, USA;University of Rome Tor Vergata, Italy;University of Rome Tor Vergata, Italy;IRCCS Fondazione Santa Lucia, Italy;Wellcome Trust Centre for Cell Biology, University of Edinburgh, UK;
Keywords: Text Mining; Full Text Article; Interactive Task; Gene Mention; NCBI Taxonomy
DOI: 10.1186/1471-2105-12-S8-S4
Source: Springer
【 Abstract 】

Background: The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus, in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested.

Results: A User Advisory Group (UAG) actively participated in the IAT design and assessment. The task focused on gene normalization (identifying gene mentions in the article and linking these genes to standard database identifiers), gene ranking based on the overall importance of each gene mentioned in the article, and gene-oriented document retrieval (identifying full text papers relevant to a selected gene). Six systems participated and all processed and displayed the same set of articles. The articles were selected based on content known to be problematic for curation, such as ambiguity of gene names, coverage of multiple genes and species, or introduction of a new gene name. Members of the UAG curated three articles for training and assessment purposes, and each member was assigned a system to review. A questionnaire related to the interface usability and task performance (as measured by precision and recall) was answered after the systems were used to curate articles. Although the limited number of articles analyzed and users involved in the IAT experiment precluded rigorous quantitative analysis of the results, a qualitative analysis provided valuable insight into some of the problems encountered by users when using the systems. The overall assessment indicates that the system usability features appealed to most users, but the system performance was suboptimal (mainly due to low accuracy in gene normalization). Some of the issues included failure of species identification and gene name ambiguity in the gene normalization task leading to an extensive list of gene identifiers to review, which, in some cases, did not contain the relevant genes. The document retrieval suffered from the same shortfalls. The UAG favored achieving high performance (measured by precision and recall), but strongly recommended the addition of features that facilitate the identification of the correct gene and its identifier, such as contextual information to assist in disambiguation.

Discussion: The IAT was an informative exercise that advanced the dialog between curators and developers and increased the appreciation of challenges faced by each group. A major conclusion was that the intended users should be actively involved in every phase of software development, and this will be strongly encouraged in future tasks. The IAT provides the first steps toward the definition of metrics and functional requirements that are necessary for designing a formal evaluation of interactive curation systems in the BioCreative IV challenge.

【 License 】

CC BY   
© Arighi et al; licensee BioMed Central Ltd. 2011
