BMC Genomics
QuickRNASeq lifts large-scale RNA-seq data analyses to the next level of automation and interactive visualization
Software
Hualin Xi [1], Jie Quan [1], Michael Vincent [2], Baohong Zhang [2], Shanrong Zhao [2], Li Xi [2], Ying Zhang [2], David von Schack [2]
[1] Computational Sciences Center of Emphasis, Pfizer Worldwide Research and Development, 02139, Cambridge, MA, USA; PharmaTherapeutics Clinical R&D, Pfizer Worldwide Research and Development, 02139, Cambridge, MA, USA
Keywords: RNA-seq; Pipeline; Workflow; Automation; Visualization; Batch processing; High-performance computing; Large-scale data analysis; D3; jQuery
DOI: 10.1186/s12864-015-2356-9
Received 2015-11-16, accepted 2015-12-23, published 2016
Source: Springer
【 Abstract 】
Background: RNA sequencing (RNA-seq), a next-generation sequencing technique for transcriptome profiling, is being increasingly used, in part driven by the decreasing cost of sequencing. Nevertheless, the analysis of the massive amounts of data generated by large-scale RNA-seq remains a challenge. Multiple algorithms pertinent to basic analyses have been developed, and there is an increasing need to automate the use of these tools so as to obtain results in an efficient and user-friendly manner. Increased automation and improved visualization of the results will help make the results and findings of the analyses readily available to experimental scientists.

Results: By combining the best open source tools developed for RNA-seq data analyses and the most advanced Web 2.0 technologies, we have implemented QuickRNASeq, a pipeline for large-scale RNA-seq data analyses and visualization. The QuickRNASeq workflow consists of three main steps. In Step #1, each individual sample is processed, including mapping RNA-seq reads to a reference genome, counting the numbers of mapped reads, quality control of the aligned reads, and SNP (single nucleotide polymorphism) calling. Step #1 is computationally intensive, and can be processed in parallel. In Step #2, the results from individual samples are merged, and an integrated and interactive project report is generated. All analysis results in the report are accessible via a single HTML entry webpage. Step #3 is the data interpretation and presentation step. The rich visualization features implemented here allow end users to interactively explore the results of RNA-seq data analyses, and to gain more insights into RNA-seq datasets. In addition, we used a real-world dataset to demonstrate the simplicity and efficiency of QuickRNASeq in RNA-seq data analyses and interactive visualizations. The seamless integration of automated capabilities with interactive visualizations in QuickRNASeq is not available in other published RNA-seq pipelines.

Conclusion: The high degree of automation and interactivity in QuickRNASeq leads to a substantial reduction in the time and effort required prior to further downstream analyses and interpretation of the analysis findings. QuickRNASeq advances primary RNA-seq data analyses to the next level of automation, and is mature for public release and adoption.
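
The shape of Steps #1 and #2 described in the abstract (independent per-sample processing that can run in parallel, followed by a merge across samples) can be illustrated with a minimal Python sketch. This is not QuickRNASeq code: the function names `process_sample`, `merge_counts`, and `run_pipeline` are hypothetical, and the exact-match lookup in `gene_index` is only a toy stand-in for real read alignment and counting.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def process_sample(reads: List[str], gene_index: Dict[str, str]) -> Dict[str, int]:
    """Step 1 stand-in (hypothetical): count reads per gene for one sample.

    A real pipeline would align reads to a reference genome; here a read
    "maps" only if it is an exact key in the toy gene_index.
    """
    counts: Dict[str, int] = {}
    for read in reads:
        gene = gene_index.get(read)
        if gene is not None:  # unmapped reads are skipped
            counts[gene] = counts.get(gene, 0) + 1
    return counts


def merge_counts(per_sample: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, int]]:
    """Step 2 stand-in (hypothetical): merge per-sample counts into a
    gene -> sample -> count table, filling missing genes with 0."""
    genes = sorted({gene for counts in per_sample.values() for gene in counts})
    return {
        gene: {sample: per_sample[sample].get(gene, 0) for sample in per_sample}
        for gene in genes
    }


def run_pipeline(
    samples: Dict[str, List[str]], gene_index: Dict[str, str], workers: int = 2
) -> Dict[str, Dict[str, int]]:
    """Process every sample independently in a worker pool, then merge."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            sample: pool.submit(process_sample, reads, gene_index)
            for sample, reads in samples.items()
        }
        per_sample = {sample: future.result() for sample, future in futures.items()}
    return merge_counts(per_sample)
```

Calling `run_pipeline({"S1": ["AAA", "CCC"], "S2": ["CCC"]}, {"AAA": "geneA", "CCC": "geneB"})` returns one merged table with a row per gene and a column per sample. Because the samples do not depend on one another, the same pattern would carry over to a process pool or a cluster scheduler, which is the property the abstract relies on when it says Step #1 can be processed in parallel.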
【 License 】
CC BY
© Zhao et al. 2016