Journal article details
PeerJ
The reuse of public datasets in the life sciences: potential risks and rewards
article
Katharina Sielemann [1], Alenka Hafner [1], Boas Pucker [1]
[1] Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec) & Faculty of Biology, Bielefeld University; Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Bielefeld University; Current Affiliation: Intercollege Graduate Degree Program in Plant Biology, Penn State University, University Park, State College; Evolution and Diversity, Department of Plant Sciences, University of Cambridge
Keywords: Reuse; Data science; Sequencing data; Genomics; Bioinformatics; Databases; Computational biology; Open science
DOI: 10.7717/peerj.9954
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Inra
【 Abstract 】

The ‘big data’ revolution has enabled novel types of analyses in the life sciences, facilitated by public sharing and reuse of datasets. Here, we review the prodigious potential of reusing publicly available datasets and the associated challenges, limitations and risks. Possible solutions to issues and research integrity considerations are also discussed. Due to the prominence, abundance and wide distribution of sequencing data, we focus on the reuse of publicly available sequence datasets. We define ‘successful reuse’ as the use of previously published data to enable novel scientific findings. By using selected examples of successful reuse from different disciplines, we illustrate the enormous potential of the practice, while acknowledging the respective limitations and risks. A checklist to determine the reuse value and potential of a particular dataset is also provided. The open discussion of data reuse and the establishment of this practice as a norm has the potential to benefit all stakeholders in the life sciences.

【 License 】

CC BY   
