BMC Bioinformatics
SUSHI: an exquisite recipe for fully documented, reproducible and reusable NGS data analysis
Software
Giancarlo Russo1  Lennart Opitz1  Hubert Rehrauer1  Ralph Schlapbach1  Weihong Qi1  Masaomi Hatakeyama2
[1] Functional Genomics Center Zurich, ETH Zurich/University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland; [2] Department of Evolutionary Biology and Environmental Studies, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
Keywords: Data analysis framework; Reproducible research; Meta-level system design
DOI: 10.1186/s12859-016-1104-8
Received 2016-02-20, accepted 2016-05-26, published 2016
Source: Springer
【 Abstract 】
Background: Next-generation sequencing (NGS) produces massive datasets consisting of billions of reads and up to thousands of samples. Subsequent bioinformatic analysis is typically done with open source tools, where each application performs a single step towards the final result. This leaves bioinformaticians with the tasks of combining the tools, managing the data files and meta-information, documenting the analysis, and ensuring reproducibility.

Results: We present SUSHI, an agile data analysis framework that relieves bioinformaticians of the administrative challenges of their data analysis. SUSHI lets users build reproducible data analysis workflows from individual applications and manages the input data, the parameters, meta-information with user-driven semantics, and the job scripts. As distinguishing features, SUSHI provides an expert command line interface as well as a convenient web interface to run bioinformatics tools. SUSHI datasets are self-contained and self-documented on the file system. This makes them fully reproducible and ready to be shared. Because the associated meta-information is formatted as plain text tables, the datasets can readily be analyzed further and interpreted outside SUSHI.

Conclusion: SUSHI provides an exquisite recipe for analysing NGS data. For users who follow the SUSHI recipe, SUSHI makes data analysis straightforward and takes care of documentation and administration tasks. Thus, users can fully dedicate their time to the analysis itself. SUSHI is suitable for use by bioinformaticians as well as life science researchers. It is targeted at, but by no means constrained to, NGS data analysis. Our SUSHI instance is in productive use and has served as the data analysis interface for more than 1000 data analysis projects. The SUSHI source code and a demo server are freely available.
【 License 】
CC BY
© Hatakeyama et al. 2016