Journal article details
Biodiversity Information Science and Standards
The Standards behind the Scenes: Explaining data from the Plazi workflow
article
Donat Agosti [1], Marcus Guidoti [1], Terry Catapano [1], Alexandros Ioannidis-Pantopikos [2], Guido Sautter [3]
[1] Plazi; [2] CERN; [3] IPD Böhm, Karlsruhe Institute of Technology
Keywords: biodiversity; data conversion; standards; MODS; TaxPub; JATS; Darwin Core; OBO; biotic interactions
DOI: 10.3897/biss.4.59178
Source: Pensoft
【 Abstract 】

As part of the CETAF COVID-19 task force, Plazi liberated taxonomic treatments, figures, observation records, biotic interactions, taxonomic names, and collection and specimen codes involving bats and viruses from scholarly publications, with the intention of creating open-access data that is findable, accessible, interoperable and reusable (FAIR). The data is accessible via TreatmentBank and the Biodiversity Literature Repository (BLR), and it is continually harvested and reused by the Global Biodiversity Information Facility (GBIF) and Global Biotic Interactions (GloBI). This data was processed, enhanced and liberated by the Plazi workflow, which relies on a dedicated infrastructure including a desktop application (GoldenGate Imagine) that is responsible for the data enhancement and converts Portable Document Format (PDF) files to a dedicated open compressed file format, the Image Markup File (IMF). To enhance the data contained in the publications, including the biological interactions, a series of standards and vocabularies is used. With the exception of TaxPub, a taxonomy-specific extension of the U.S. National Center for Biotechnology Information's (NCBI) Journal Article Tag Suite (JATS), all of the vocabularies used had been proposed previously. This is in line with Plazi's mission to reuse existing standards unless none is available. The following standards or vocabularies are used: Metadata Object Description Schema (MODS) to model article metadata in Plazi's XML documents; Darwin Core for taxonomic ranks and materials citation related data; Open Biological and Biomedical Ontology (OBO); Relations Ontology for biological interactions between organisms. The latter two are also used in the custom metadata in the Biodiversity Literature Repository at Zenodo. In this presentation we will provide an overview of the different types of data, followed by the standards or vocabularies applied to each of them and their parts.
The goal is to provide context on how the data liberated by Plazi, which is extensively reused by third-party applications such as GBIF and GloBI, is described. The use of these standards allows fully automated, daily data ingestion by GBIF.
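
To make the role of the vocabularies concrete, the following is a minimal sketch of how a taxon record might be keyed by Darwin Core term IRIs and linked to another taxon by a Relations Ontology predicate. It is an illustration only: the helper names (`dwc_record`, `interaction`), the record layout, and the example taxa are assumptions made here, not Plazi's actual export format or data.

```python
# Illustrative sketch only: the record layout and values are assumptions,
# not Plazi's actual export format.
DWC = "http://rs.tdwg.org/dwc/terms/"    # Darwin Core term namespace
OBO = "http://purl.obolibrary.org/obo/"  # OBO PURL base

HAS_HOST = OBO + "RO_0002454"  # Relations Ontology term labelled "has host"


def dwc_record(**terms):
    """Build a record whose keys are full Darwin Core term IRIs."""
    return {DWC + name: value for name, value in terms.items()}


def interaction(subject, predicate, obj):
    """Return a (subject name, relation IRI, object name) triple."""
    name_key = DWC + "scientificName"
    return (subject[name_key], predicate, obj[name_key])


# Hypothetical example values, chosen only to mirror the bat/virus use case.
bat = dwc_record(scientificName="Rhinolophus affinis", taxonRank="species")
virus = dwc_record(scientificName="Betacoronavirus", taxonRank="genus")

triple = interaction(virus, HAS_HOST, bat)
print(triple)
```

Because both the field names and the relation are globally resolvable identifiers rather than local labels, a consumer can in principle interpret such records without source-specific mapping, which is the property that automated ingestion depends on.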

【 License 】

Unknown   
