BMC Genomics
A divide-and-conquer algorithm for large-scale de novo transcriptome assembly through combining small assemblies from existing algorithms
Research
Sing-Hoi Sze1, Aaron M. Tarone2, Jonathan J. Parrott2
[1] Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77843, USA; Department of Biochemistry & Biophysics, Texas A&M University, College Station, TX 77843, USA; Department of Entomology, Texas A&M University, College Station, TX 77843, USA
Keywords: Divide-and-conquer; RNA-Seq; de novo
DOI: 10.1186/s12864-017-4270-9
Source: Springer
【 Abstract 】
Background: While the continued development of high-throughput sequencing has facilitated studies of entire transcriptomes in non-model organisms, the incorporation of an increasing number of RNA-Seq libraries has made de novo transcriptome assembly difficult. Although algorithms that can assemble a large amount of RNA-Seq data are available, they are generally very memory-intensive and can only be used to construct small assemblies.

Results: We develop a divide-and-conquer strategy that allows these algorithms to be utilized by subdividing a large RNA-Seq data set into small libraries. Each individual library is assembled independently by an existing algorithm, and a merging algorithm is developed to combine these assemblies by picking a subset of high-quality transcripts to form a large transcriptome. When compared to existing algorithms that return a single assembly directly, this strategy achieves accuracy comparable to or higher than that of memory-efficient algorithms that can be used to process a large amount of RNA-Seq data, and accuracy comparable to or lower than that of memory-intensive algorithms that can only be used to construct small assemblies.

Conclusions: Our divide-and-conquer strategy allows memory-intensive de novo transcriptome assembly algorithms to be utilized to construct large assemblies.
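The overall shape of the strategy described in the abstract (subdivide, assemble each library independently, merge) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `split_reads`, `merge_assemblies` and `divide_and_conquer` are hypothetical, the `assemble` argument stands in for whichever existing assembler is used, and the merge rule (drop a transcript when it is an exact substring of a longer kept transcript) is only a placeholder. The abstract does not specify how the paper's merging algorithm picks its subset of high-quality transcripts, so this is not that algorithm.

```python
from typing import Callable, Iterable, List


def split_reads(reads: List[str], num_libraries: int) -> List[List[str]]:
    """Divide step: deal reads round-robin into small libraries (assumed scheme)."""
    if num_libraries < 1:
        raise ValueError("num_libraries must be at least 1")
    return [reads[i::num_libraries] for i in range(num_libraries)]


def merge_assemblies(assemblies: Iterable[List[str]]) -> List[str]:
    """Combine step with a placeholder rule, NOT the paper's merging algorithm.

    A transcript is kept unless it is an exact substring of an already kept,
    longer (or equal-length) transcript.
    """
    # Deduplicate, then visit candidates from longest to shortest
    # (ties broken alphabetically so the result is deterministic).
    candidates = sorted(
        {t for assembly in assemblies for t in assembly},
        key=lambda t: (-len(t), t),
    )
    kept: List[str] = []
    for transcript in candidates:
        if not any(transcript in longer for longer in kept):
            kept.append(transcript)
    return kept


def divide_and_conquer(
    reads: List[str],
    num_libraries: int,
    assemble: Callable[[List[str]], List[str]],
) -> List[str]:
    """Assemble each small library independently, then merge the results."""
    libraries = split_reads(reads, num_libraries)
    assemblies = [assemble(library) for library in libraries]
    return merge_assemblies(assemblies)


if __name__ == "__main__":
    # Toy stand-in for a real assembler: treat each distinct read as a transcript.
    def toy_assemble(library: List[str]) -> List[str]:
        return sorted(set(library))

    reads = ["ACGTACGT", "ACGT", "TTGCA", "GCA", "ACGTACGT"]
    print(divide_and_conquer(reads, 2, toy_assemble))
```

Because each library is assembled on its own, peak memory in a setup like this is bounded by the largest single library rather than by the full data set, which is the motivation the abstract gives for the approach.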
【 License 】
CC BY
© The Author(s) 2017