Journal article details
Biodiversity Information Science and Standards
Developing Standards for Improved Data Quality and for Selecting Fit for Use Biodiversity Data
article
Arthur D Chapman [1], Lee Belbin [2], Paula F Zermoglio [3], John Wieczorek [4], Paul J Morris [5], Miles Nicholls [6], Emily Rose Rees [2], Allan Koch Veiga [7], Alexander Thompson [8], Antonio Mauro Saraiva [7], Shelley A James [9], Christian Gendreau [1,10], Abigail Benson [1,11], Dmitry Schigel [1,10]
[1] Australian Biodiversity Information Services; [2] The Atlas of Living Australia; [3] VertNet; [4] Museum of Vertebrate Zoology, University of California; [5] Museum of Comparative Zoology, Harvard University; [6] Atlas of Living Australia; [7] University of Sao Paulo; [8] iDigBio; [9] Department of Biodiversity; [10] Global Biodiversity Information Facility - Secretariat; [11] U.S. Geological Survey
Keywords: data quality; profile; framework; fitness for use; standards; tests and assertions; data quality tests; vocabularies; Darwin Core; GBIF
DOI: 10.3897/biss.4.50889
Source: Pensoft
【 Abstract 】

The quality of biodiversity data publicly accessible via aggregators such as GBIF (Global Biodiversity Information Facility), the ALA (Atlas of Living Australia), iDigBio (Integrated Digitized Biocollections), and OBIS (Ocean Biogeographic Information System) is often questioned, especially by the research community.

The Data Quality Interest Group, established by Biodiversity Information Standards (TDWG) and GBIF, has been engaged in four main activities: developing a framework for the assessment and management of data quality using a fitness for use approach; defining a core set of standardised tests and associated assertions based on Darwin Core terms; gathering and classifying user stories to form contextually themed use cases, such as species distribution modelling, agrobiodiversity, and invasive species; and developing a standardised format for building and managing controlled vocabularies of values.

Using the developed framework, data quality profiles have been built from use cases to represent user needs. Quality assertions can then be used to filter data suitable for a purpose. The assertions can also be used to provide feedback to data providers and custodians to assist in improving data quality at the source. In a case study, two different implementations of tests and assertions based on the Darwin Core "Event Date" terms were run against GBIF data to demonstrate that the tests are implementation agnostic, can be run on large aggregated datasets, and can make biodiversity data more fit for typical research uses.
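
As a rough illustration of how a test, its assertion, and a profile-based filter could fit together, the following minimal Python sketch checks the Darwin Core `eventDate` term and keeps only records that pass every test in a profile. It is not either of the implementations used in the case study: the function names, the result labels, and the sample records are assumptions made for this example, and the date check accepts only plain `YYYY-MM-DD` values rather than the full range of formats the real tests handle.

```python
from datetime import date


def validate_event_date(record, today=None):
    """Hypothetical validation test on the Darwin Core eventDate term.

    Returns an assertion label (labels are illustrative assumptions):
      PREREQUISITES_NOT_MET  eventDate is missing or empty
      NOT_COMPLIANT          eventDate is unparseable or lies in the future
      COMPLIANT              eventDate is a valid calendar date up to today
    """
    today = today or date.today()
    value = (record.get("eventDate") or "").strip()
    if not value:
        return "PREREQUISITES_NOT_MET"
    try:
        parsed = date.fromisoformat(value)
    except ValueError:
        return "NOT_COMPLIANT"
    return "COMPLIANT" if parsed <= today else "NOT_COMPLIANT"


def fit_for_use(records, profile):
    """Keep records for which every test in the profile is COMPLIANT."""
    return [
        record
        for record in records
        if all(test(record) == "COMPLIANT" for test in profile)
    ]


# Made-up sample records; not taken from GBIF.
records = [
    {"occurrenceID": "a", "eventDate": "1987-03-15"},
    {"occurrenceID": "b", "eventDate": "15/03/1987"},
    {"occurrenceID": "c", "eventDate": ""},
    {"occurrenceID": "d", "eventDate": "2999-01-01"},
]

# A "profile" here is simply the list of tests a use case requires.
profile = [validate_event_date]

kept = fit_for_use(records, profile)
print([record["occurrenceID"] for record in kept])
```

The same assertion labels could equally be reported back to the data provider for each failing record instead of being used to discard it, which is the feedback use the abstract describes.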

【 License 】

Unknown   
