Journal Article Details
BMC Bioinformatics
Fast, accurate, and lightweight analysis of BS-treated reads with ERNE 2
Research Article
Nicola Prezza [1], Alberto Policriti [2], Francesco Vezzi [3], Max Käller [3]
[1] Department of Mathematics and Informatics, University of Udine, via delle Scienze, 33100, Udine, Italy; [2] Institute of Applied Genomics, via J. Linussio, 33100, Udine, Italy; [3] Science for Life Laboratory, Tomtebodavägen 23A, 17165, Solna, Sweden
Keywords: Bisulfite; DNA methylation; NGS; Succinct hashing; BWT
DOI: 10.1186/s12859-016-0910-3
Source: Springer
【 Abstract 】

Background: Bisulfite treatment of DNA followed by sequencing (BS-seq) has become a standard technique in epigenetic studies, providing researchers with tools for generating single-base resolution maps of whole methylomes. Aligning bisulfite-treated reads, however, is a computationally difficult task: bisulfite treatment decreases the (lexical) complexity of low-methylated genomic regions, and C-to-T mismatches may reflect unmethylated cytosines rather than SNPs or sequencing errors. Further challenges arise both during and after the alignment phase: the data structures used by the aligner should be fast and should fit into main memory, and the methylation-caller output should be compressed because of its significant size.

Methods: The data structures proposed in the literature for aligning bisulfite-treated reads can be roughly grouped into two main categories: those storing pointers at each text position (e.g. hash tables, suffix trees/arrays), and those using the information-theoretic minimum number of bits (e.g. FM indexes and compressed suffix arrays). The former are fast but memory-consuming; the latter are light but much slower. In this paper, we try to close this gap by proposing a data structure for aligning bisulfite-treated reads that is at the same time fast, light, and very accurate. We reach this objective by combining a recent theoretical result on succinct hashing with a bisulfite-aware hash function. Furthermore, the new versions of the tools implementing our ideas (the aligner ERNE-BS5 2 and the caller ERNE-METH 2) have been extended with increased downstream compatibility (EPP/Bismark cov output formats), output compression, and support for target enrichment protocols.

Results: Experimental results on public and simulated WGBS libraries show that our algorithmic solution is a competitive tradeoff between hash-based and BWT-based indexes, being as fast and accurate as the former and as memory-efficient as the latter.

Conclusions: The new functionalities of our bisulfite aligner and caller make it a fast and memory-efficient tool, useful for analyzing big datasets with limited computational resources, for easily processing target enrichment data, and for producing statistics such as protocol efficiency and coverage as a function of the distance from target regions.
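To make the idea of a bisulfite-aware hash function concrete, here is a minimal sketch. It is not the hash function used by ERNE-BS5 2, whose details are not given in this record. It only illustrates the general principle that mapping C and T to the same code makes a k-mer hash insensitive to C-to-T conversion. The names `CODE` and `bs_hash` and the example sequences are hypothetical.

```python
# Hypothetical illustration only: NOT the actual ERNE-BS5 2 hash function.
# C and T share one code, so a k-mer and its bisulfite-converted
# counterpart (unmethylated C read as T) hash to the same value.
CODE = {"A": 0, "C": 1, "G": 2, "T": 1}


def bs_hash(kmer: str) -> int:
    """Base-3 positional hash that is invariant to C<->T substitutions."""
    h = 0
    for base in kmer.upper():
        h = h * 3 + CODE[base]
    return h


reference = "ACGTCCGA"
# Same locus after bisulfite treatment: two unmethylated Cs became T,
# one methylated C was preserved.
read = "ATGTTCGA"

print(bs_hash(reference) == bs_hash(read))  # C/T differences are ignored
print(bs_hash("ACGT") == bs_hash("AGGT"))   # other mismatches still change the hash
```

This sketch handles only C-to-T conversion on one strand. Reads from the opposite strand show G-to-A changes and would need a complementary mapping, and a real index would store these hashes in a succinct structure rather than compare them directly.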

【 License 】

CC BY   
© Prezza et al. 2016
