Biodiversity Information Science and Standards
Unity in Variety: Developing a collection description standard by consensus
article
Matt Woodburn1  Deborah L Paul2  Wouter Addink3  Steven J Baskauf6  Stanley Blum7  Cat Chapman8  Sharon Grant9  Quentin Groom1,10  Janeen Jones9  Mareike Petersen1,11  Niels Raes1,12  David Smith1  Laura Tilley1,13  Maarten Trekels1,10  Michael Trizna1,14  William Ulate1,15  Sarah Vincent1  Ramona Walls1,17  Kate Webbink9  Paula Zermoglio1,19 
[1] Natural History Museum;Florida State University;Naturalis Biodiversity Center;Distributed System of Scientific Collections - DiSSCo;Species 2000 Secretariat;Vanderbilt University Libraries;Biodiversity Information Standards;University of Florida;Field Museum;Meise Botanic Garden;Museum für Naturkunde Berlin, Leibniz Institute for Evolution and Biodiversity Science;NLBIF;CETAF;Smithsonian Institution;CRBio;Missouri Botanical Garden;University of Arizona;CyVerse;VertNet
Keywords: collection descriptions; TDWG; data standards; biodiversity; geodiversity; natural sciences
DOI  :  10.3897/biss.4.59233
Source: Pensoft
【 Abstract 】

Digitisation and publication of museum specimen data is happening worldwide but is far from complete. Museums can start by sharing what they know about their holdings at a higher level, long before each object has its own record. Information about what is held in collections worldwide is needed by many stakeholders, including collections managers, funders, researchers, policy-makers, industry, and educators. To aggregate this information from collections, the data need to be standardised (Johnston and Robinson 2002). The Biodiversity Information Standards (TDWG) Collection Descriptions (CD) Task Group is therefore developing a data standard for describing collections, which gives the ability to provide: automated metrics, using standardised collection descriptions and/or data derived from specimen datasets (e.g., counts of specimens); and a global registry of physical collections (i.e., digitised or non-digitised).

Outputs will include a data model to underpin the new standard, and guidance and reference implementations for the practical use of the standard in institutional and collaborative data infrastructures.

The Task Group employs a community-driven approach to standard development. With international participation, workshops at the Natural History Museum (London 2019) and the MOBILISE workshop (Warsaw 2020) allowed over 50 people to contribute to this work. Our group organised online "barbecues" (BBQs) so that many more could contribute to standard definitions and address data model design challenges. Cloud-based tools (e.g., GitHub, Google Sheets) are used to organise and publish the group's work and make it easy to participate. A Wikibase instance is also used to test and demonstrate the model using real data.

A range of global, regional, and national initiatives are interested in the standard (see Task Group charter).
Some, like GRSciColl (now at the Global Biodiversity Information Facility (GBIF)), Index Herbariorum (IH), and the iDigBio US Collections List, are existing catalogues. Others, including the Consortium of European Taxonomic Facilities (CETAF) and the Distributed System of Scientific Collections (DiSSCo), include collection descriptions as a key part of their near-term development plans. As part of the EU-funded SYNTHESYS+ project, GBIF organised a virtual workshop, Advancing the Catalogue of the World's Natural History Collections, to get international input for such a resource that would use this CD standard.

Some major complexities present themselves in designing a standardised approach to representing collection descriptions data. It is not the first time that the natural science collections community has tried to address them (see the TDWG Natural Collections Description standard). Beyond natural sciences, the library community in particular has given thought to this (Heaney 2001, Johnston and Robinson 2002), noting significant difficulties. One hurdle is that collections may be broken down into different degrees of granularity according to different criteria, and may also overlap, so that a single object can be represented in more than one collection description. Managing statistics such as numbers of objects is complex due to data gaps and variable degrees of certainty about collection contents. It also takes considerable effort from collections staff to generate structured data about their undigitised holdings. We need to support simple, high-level collection summaries as well as detailed quantitative data, and to be able to update them as needed. We need a simple approach, but one that can also handle the complexities of data, scope, and social needs, for digitised and undigitised collections.

The data standard itself is a defined set of classes and properties that can be used to represent groups of collection objects and their associated information.
These incorporate common characteristics ('dimensions') by which we want to describe, group and break down our collections, metrics for quantifying those collections, and properties such as persistent identifiers for tracking collections and managing their digital counterparts. Existing terms from other standards (e.g., Darwin Core, ABCD) are re-used where possible.

The data model (Fig. 1) underpinning the standard defines the relationships between those different classes, and ensures that the structure as well as the content is comparable across different datasets. It centres around the core concept of an 'object group', representing a set of physical objects that is defined by one or more dimensions (e.g., taxonomy and geographic origin), and linked to other entities such as the holding institution. Quantitative data about its contents (e.g., counts of objects or taxa) are attached to the object group, along with more qualitative information describing the contents of the group as a whole.

In this presentation, we will describe the draft standard and data model with examples of early adoption for real-world and example data. We will also discuss the vision of how the new standard may be adopted and its potential impact on collection discoverability across the collections community.
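
The 'object group' concept described above can be illustrated with a minimal Python sketch. This is not the draft standard's actual schema: the class names, field names (`dimensions`, `metrics`, `description`), the `find_groups` helper, and all identifiers and counts below are hypothetical, chosen only to show how a group defined by dimensions might carry quantitative and qualitative information and link to a holding institution.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Institution:
    """Holding institution linked to an object group (illustrative fields)."""
    name: str
    identifier: str  # stand-in for a persistent identifier


@dataclass
class ObjectGroup:
    """A set of physical objects defined by one or more dimensions."""
    identifier: str                    # stand-in for a persistent identifier
    institution: Institution
    dimensions: dict[str, str]         # e.g. taxonomy, geographic origin
    metrics: dict[str, int] = field(default_factory=dict)  # e.g. object counts
    description: str = ""              # qualitative summary of the group


def find_groups(groups: list[ObjectGroup], **criteria: str) -> list[ObjectGroup]:
    """Return the groups whose dimensions match every given criterion."""
    return [
        g for g in groups
        if all(g.dimensions.get(key) == value for key, value in criteria.items())
    ]


# Hypothetical example data, not taken from any real collection.
museum = Institution(name="Example Museum", identifier="example:inst-1")
groups = [
    ObjectGroup(
        identifier="example:og-1",
        institution=museum,
        dimensions={"taxonomy": "Lepidoptera", "geographic_origin": "Europe"},
        metrics={"object_count": 1200},
        description="Pinned butterflies and moths.",
    ),
    ObjectGroup(
        identifier="example:og-2",
        institution=museum,
        dimensions={"taxonomy": "Coleoptera", "geographic_origin": "Europe"},
        metrics={"object_count": 800},
    ),
]

european = find_groups(groups, geographic_origin="Europe")
print([g.identifier for g in european])
```

Because object groups may overlap, so that a single object can appear in more than one collection description, summing a metric such as `object_count` across matching groups could double-count objects. The sketch therefore only filters groups and does not aggregate their counts.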

【 License 】

Unknown   
