BMC Bioinformatics
Exploiting sparseness in de novo genome assembly
Proceedings
Zhanshan Sam Ma1, Mihai Pop2, Charles H Cannon3, Chengxi Ye4, Douglas W Yu5
[1] Computational Biology and Medical Ecology Lab; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, 650223, Kunming, Yunnan, China; Department of Computer Science and Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD, USA; Ecological Evolution Group, Xishuangbanna Tropical Botanic Garden, Chinese Academy of Sciences, 666303, Menglun, Yunnan, China; Department of Biological Sciences, Texas Tech University, 79410, Lubbock, TX, USA; Ecology & Evolution of Plant-Animal Interaction Group, Xishuangbanna Tropical Botanic Garden, Chinese Academy of Sciences, 666303, Menglun, Yunnan, China; Ecology, Conservation, and Environment Center; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, 650223, Kunming, Yunnan, China; Department of Computer Science and Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD, USA; Ecology, Conservation, and Environment Center; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, 650223, Kunming, Yunnan, China; School of Biological Sciences, University of East Anglia, NR47TJ, Norwich, Norfolk, UK
Keywords: Genome Assembly; Memory Requirement; Sequencing Error; Memory Usage; Sparse Graph
DOI: 10.1186/1471-2105-13-S6-S1
Source: Springer
【 Abstract 】
Background: The very large memory requirements for constructing assembly graphs in de novo genome assembly limit current algorithms to super-computing environments.

Methods: In this paper, we demonstrate that constructing a sparse assembly graph, which stores only a small fraction of the observed k-mers as nodes together with the links between these nodes, allows the de novo assembly of even moderately sized genomes (~500 M) on a typical laptop computer.

Results: We implement this sparse graph concept in a proof-of-principle software package, SparseAssembler, which uses a new sparse k-mer graph structure evolved from the de Bruijn graph. We test SparseAssembler on both simulated and real data, achieving ~90% memory savings while retaining high assembly accuracy, without sacrificing speed in comparison to existing de novo assemblers.
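The idea of keeping only a fraction of the observed k-mers as nodes can be illustrated with a small sketch. This is not SparseAssembler's implementation; the function names `dense_kmers` and `build_sparse_graph`, the skip parameter `g`, and the simple sampling rule (reuse a k-mer that is already a node, otherwise add a new node once `g` positions have passed since the last one) are assumptions made for illustration, and reverse complements and sequencing errors are ignored.

```python
from collections import defaultdict


def dense_kmers(reads, k):
    """All distinct k-mers: the node set of an ordinary de Bruijn graph."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def build_sparse_graph(reads, k, g):
    """Keep roughly one k-mer in every g as a node (illustrative rule only).

    Returns (nodes, links), where links[a] holds the base strings that
    extend node a until the next sampled node in some read is reached.
    """
    nodes = set()
    links = defaultdict(set)
    for read in reads:
        prev_pos = None  # start position of the last node seen in this read
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            is_node = kmer in nodes
            # Add a new node only if no node was seen in the last g positions.
            if not is_node and (prev_pos is None or i - prev_pos >= g):
                nodes.add(kmer)
                is_node = True
            if is_node:
                if prev_pos is not None:
                    prev = read[prev_pos:prev_pos + k]
                    # Bases that must be appended to prev to arrive at kmer.
                    links[prev].add(read[prev_pos + k:i + k])
                prev_pos = i
    return nodes, links


if __name__ == "__main__":
    reads = ["ACGTACGGTCAATGC"]  # toy input, not real sequencing data
    nodes, links = build_sparse_graph(reads, k=5, g=3)
    print(len(dense_kmers(reads, 5)), "k-mers in the dense graph")
    print(len(nodes), "nodes in the sparse graph")
    for node, exts in sorted(links.items()):
        print(node, "->", sorted(exts))
```

In this toy setting the number of stored nodes shrinks roughly by a factor of `g`, at the cost of storing up to `g` bases per link; the memory savings reported in the abstract come from the authors' own data structure and are not reproduced by this sketch.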
【 License 】
CC BY
© Ye et al.; licensee BioMed Central Ltd. 2012