期刊论文详细信息
Biodiversity Information Science and Standards
Data Quality Task Group 2: Tests and Assertions
article
Lee Belbin1  Arthur D Chapman2  John Wieczorek3  Paula Zermoglio4  Paul J Morris5 
[1] Atlas of Living Australia;Australian Biodiversity Information Services;Museum of Vertebrate Zoology, University of California;Instituto de Ecología, Genética y Evolución de Buenos Aires ,(IEGEBA-CONICET), University of Buenos Aires;Museum of Comparative Zoology, Harvard University
关键词: Fitness for use;    fitness for purpose;     tests;    Darwin Core;    quality.;   
DOI  :  10.3897/biss.3.35626
来源: Pensoft
PDF
【 摘 要 】

‘Data Quality Test and Assertions’ Task Group 2 (https://www.tdwg.org/community/bdq/tg-2/) has taken another year to clarify the 102 tests (https://github.com/tdwg/bdq/issues?q=is%3Aissue+is%3Aopen+label%3ATest). The original mandate to develop a core suite of tests that could be widely applied from data collection to user evaluation of aggregated data seemed straight-forward. Two years down the track, we have proven that to be incorrect. Among the final tests are complexities that none of the core group anticipated. For example, the need for a definition of ‘empty’ or the ‘Expected response’ from the test under various scenarios.The record-based tests apply to Darwin Core terms (https://dwc.tdwg.org/terms/) and have been classified as of type validation (66), amendment (29), notification (3) or measure (5). Validations test one or more Darwin Core terms against known characteristics, for example, VALIDATION_MONTH_NOTSTANDARD. Amendments may be applied to Darwin Core terms where we can unambiguously offer an improvement to the record, for example, AMENDMENT_MONTH_STANDARDIZED. Notifications are made where we believe a flag will help alert users to an issue that needs evaluation, for example, NOTIFICATION_DATAGENERALIZATIONS_NOTEMPTY. Measures are summaries of test outcomes at the record level, for example, MEASURE_AMENDMENTS_PROPOSED.We note that 41 require some parameters to be established at the time of test implementation, 20 tests require access to a currently accepted vocabulary and 3 tests rely on ISO/DCMI standards. The dependency on vocabularies to circumscribe permissible values for Darwin Core terms led to the establishment by Paula Zermoglio of DQ Task Group 4 (https://github.com/tdwg/bdq/tree/master/Vocabularies). A vocabulary of 154 terms that are associated with the tests and assertions have been developed.  As at the time of writing this abstract, test data and demonstration code implementation of each test are yet to be completed. We hope these will be finalized by the time of this presentation.

【 授权许可】

Unknown   

【 预 览 】
附件列表
Files Size Format View
RO202307130001929ZK.pdf 62KB PDF download
  文献评价指标  
  下载次数:6次 浏览次数:4次