Journal article details
Journal of Biomedical Semantics
Making species checklists understandable to machines – a shift from relational databases to ontologies
Eero Hyvönen [1], Hannu Saarenmaa [2], Jouni Tuominen [1], Nina Laurenne [1]
[1] Semantic Computing Research Group (SeCo), Department of Media Technology, Aalto University, P.O. Box 15500, 00076 Aalto, Espoo, Finland
[2] Digitarium, University of Eastern Finland, P.O. Box 111, 80101 Joensuu, Finland
Keywords: Species checklist; Linked data; Semantic web; Ontology; HTTP URI; LSID; Taxonomic concept; Scientific name
DOI: 10.1186/2041-1480-5-40
Received 2013-04-14; accepted 2014-08-26; published 2014
【 Abstract 】

Background

The scientific names of plants and animals play a major role in Life Sciences as information is indexed, integrated, and searched using scientific names. The main problem with names is their ambiguous nature, because more than one name may point to the same taxon and multiple taxa may share the same name. In addition, scientific names change over time, which makes them open to various interpretations. Applying machine-understandable semantics to these names enables efficient processing of biological content in information systems. The first step is to use unique persistent identifiers instead of name strings when referring to taxa. The most commonly used identifiers are Life Science Identifiers (LSID), which are traditionally used in relational databases, and more recently HTTP URIs, which are applied on the Semantic Web by Linked Data applications.
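
To make the two identifier styles concrete, here is a minimal Python sketch that classifies an identifier string as an LSID, an HTTP URI, or a plain name string. It assumes the LSID layout `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`; the function name `classify_identifier` and the example.org identifiers are hypothetical and do not come from the paper.

```python
from urllib.parse import urlparse


def classify_identifier(identifier: str) -> dict:
    """Classify a taxon identifier as an LSID, an HTTP URI, or a bare name string."""
    parts = identifier.split(":")
    # An LSID starts with urn:lsid, followed by authority, namespace, object
    # and an optional revision, all separated by colons.
    if len(parts) in (5, 6) and parts[0].lower() == "urn" and parts[1].lower() == "lsid":
        return {
            "kind": "lsid",
            "authority": parts[2],
            "namespace": parts[3],
            "object": parts[4],
            "revision": parts[5] if len(parts) == 6 else None,
        }
    parsed = urlparse(identifier)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        # An HTTP URI doubles as a web address that can be dereferenced.
        return {"kind": "http_uri", "host": parsed.netloc, "path": parsed.path}
    # Anything else is treated as an ambiguous name string.
    return {"kind": "name_string", "value": identifier}
```

For instance, `classify_identifier("urn:lsid:example.org:taxname:12345")` yields a record of kind `lsid`, while a string such as `"Aus bus"` (a placeholder name) falls through to `name_string`.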

Results

We introduce two models for expressing taxonomic information in the form of species checklists. First, we show how species checklists are presented in a relational database system using LSIDs. Then, to obtain a more detailed representation of taxonomic information, we introduce the meta-ontology TaxMeOn to model the same content as Semantic Web ontologies in which taxa are identified using HTTP URIs. We also explore how changes in scientific names can be managed over time.
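
The contrast between the two models can be sketched roughly as follows. This Python example is illustrative only and is not the authors' schema: the table layout, the example.org identifiers, the placeholder names "Aus bus" and "Xus bus", and property names such as `ex:hasScientificName` and `ex:replacedBy` are assumptions made for this sketch and are not TaxMeOn terms.

```python
import sqlite3

# Model 1: relational checklist, one row per taxon keyed by a (hypothetical) LSID.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checklist (lsid TEXT PRIMARY KEY, name TEXT, parent_lsid TEXT)"
)
conn.executemany(
    "INSERT INTO checklist VALUES (?, ?, ?)",
    [
        ("urn:lsid:example.org:taxon:1", "Aus", None),
        ("urn:lsid:example.org:taxon:2", "Aus bus", "urn:lsid:example.org:taxon:1"),
    ],
)

# Model 2: the same content as subject-predicate-object triples with HTTP URIs.
BASE = "http://example.org/taxon/"
triples = {
    (BASE + "1", "ex:hasScientificName", "Aus"),
    (BASE + "2", "ex:hasScientificName", "Aus bus"),
    (BASE + "2", "ex:isPartOfHigherTaxon", BASE + "1"),
}


def rename(triples, old_uri, new_uri, new_name):
    """Record a name change by adding a new taxon and linking the old one to it.

    The old statements are kept, so the history remains queryable.
    """
    return triples | {
        (new_uri, "ex:hasScientificName", new_name),
        (old_uri, "ex:replacedBy", new_uri),
    }


def current_name(triples, uri):
    """Follow ex:replacedBy links to the latest taxon and return its name."""
    seen = set()
    while uri not in seen:  # the seen set guards against cyclic links
        seen.add(uri)
        successors = [o for s, p, o in triples if s == uri and p == "ex:replacedBy"]
        if not successors:
            break
        uri = successors[0]
    names = [o for s, p, o in triples if s == uri and p == "ex:hasScientificName"]
    return names[0] if names else None
```

In the relational form a name change typically overwrites or versions a row, whereas in the triple form `rename` only adds statements, so both the old and the new name stay addressable by URI.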

Conclusions

The use of HTTP URIs is preferable for presenting the taxonomic information of species checklists. Unlike an LSID, an HTTP URI both identifies a taxon and operates as a web address from which additional information about the taxon can be retrieved. This enables the integration of biological data from different sources on the web using Linked Data principles and prevents the formation of information silos. The Linked Data approach allows a user to assemble information and evaluate the complexity of taxonomic data based on conflicting views of taxonomic classifications. Using HTTP URIs and Semantic Web technologies also facilitates the representation of the semantics of biological data and, in this way, the creation of more “intelligent” biological applications and services.
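
As a rough illustration of the integration argument, the following Python sketch merges statements from two hypothetical sources that use the same HTTP URI for a taxon, and then lists where their classifications disagree. The source names, the example.org URIs, and the `ex:` property names are invented for this example and are not taken from the paper.

```python
from collections import defaultdict

TAXON = "http://example.org/taxon/2"

# Statements about the same taxon URI published by two independent sources.
source_a = {
    (TAXON, "ex:hasScientificName", "Aus bus"),
    (TAXON, "ex:isPartOfHigherTaxon", "http://example.org/taxon/1"),
}
source_b = {
    (TAXON, "ex:hasScientificName", "Aus bus"),
    (TAXON, "ex:isPartOfHigherTaxon", "http://example.org/taxon/9"),
    (TAXON, "ex:occursIn", "Finland"),
}


def merge(**sources):
    """Union the triples of all sources, remembering which source asserted each."""
    merged = defaultdict(set)
    for source_name, triples in sources.items():
        for triple in triples:
            merged[triple].add(source_name)
    return merged


def conflicts(merged, predicate):
    """Return subjects for which the sources give different values of `predicate`."""
    values = defaultdict(dict)
    for (s, p, o), asserted_by in merged.items():
        if p == predicate:
            values[s][o] = sorted(asserted_by)
    return {s: objs for s, objs in values.items() if len(objs) > 1}
```

Because both sources share one URI for the taxon, `merge(a=source_a, b=source_b)` needs no name matching, and `conflicts(merged, "ex:isPartOfHigherTaxon")` surfaces the two competing parent taxa together with the source that asserts each.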

【 License 】

   
2014 Laurenne et al.; licensee BioMed Central Ltd.
