BMC Bioinformatics
Persistent memory as an effective alternative to random access memory in metagenome assembly
Research Article
Rob Egan [1], Harrison Ho [2], Zhong Wang [3], Yue Li [4], Jingchao Sun [4], Zhining Qiu [4]
[1] Department of Energy Joint Genome Institute, 94720, Berkeley, CA, USA
[2] Department of Energy Joint Genome Institute, 94720, Berkeley, CA, USA; School of Natural Sciences, University of California at Merced, 95343, Merced, CA, USA
[3] Department of Energy Joint Genome Institute, 94720, Berkeley, CA, USA; School of Natural Sciences, University of California at Merced, 95343, Merced, CA, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, 94720, Berkeley, CA, USA
[4] MemVerge Inc, 95035, Milpitas, CA, USA
Keywords: Metagenome assembly; Persistent memory; Out-of-memory
DOI: 10.1186/s12859-022-05052-8
Received 2022-05-04, accepted 2022-11-11, published 2022
Source: Springer
【 Abstract 】
Background: Metagenome assembly reconstructs the genomes of members of complex microbial communities and allows these genomes to be characterized without laborious cultivation or single-cell metagenomics. The process is memory intensive and time consuming. Multi-terabyte sequence datasets can be too large to assemble on a single compute node, and there is no reliable method to predict the memory requirement because memory consumption patterns are data specific. Currently, running out of memory (OOM) is one of the most prevalent causes of metagenome assembly failure.

Results: In this study, we explored the possibility of using Persistent Memory (PMem) as a less expensive substitute for dynamic random access memory (DRAM) to reduce OOM failures and increase the scalability of metagenome assemblers. We evaluated the execution time and memory usage of three popular metagenome assemblers (MetaSPAdes, MEGAHIT, and MetaHipMer2) on datasets of up to one terabase. We found that PMem enables metagenome assemblers to run on terabyte-sized datasets by partially or fully substituting for DRAM. Depending on the configured DRAM/PMem ratio, metagenome assemblies running with PMem can achieve a speed similar to that of DRAM, while the worst case showed a roughly two-fold slowdown. In addition, different assemblers displayed distinct memory/speed trade-offs in the same hardware/software environment.

Conclusions: We demonstrated that PMem can expand the capacity of DRAM to allow larger metagenome assemblies, with a potential trade-off in speed. Because PMem can be used directly without any application-specific code modification, these findings are likely to generalize to other memory-intensive bioinformatics applications.
【 License 】
CC BY
© The Author(s) 2022