BMC Genomics
Bis-class: a new classification tool of methylation status using Bayes classifier and local methylation information
Soojin V Yi [2], Taesung Park [1], Xingyu Yang [2], Iksoo Huh [1]
[1] Department of Statistics, Bioinformatics and Biostatistics Laboratory, Seoul National University, 151-742 Seoul, Korea; [2] School of Biology, Institute of Bioengineering and Biosciences, Georgia Institute of Technology, 310 Ferst Drive, 30332 Atlanta, GA, USA
Keywords: MethylC-seq; Local DNA methylation level; Bayes classifier; DNA methylation
Others: 1216445
DOI: 10.1186/1471-2164-15-608
Received 2014-03-15, accepted 2014-07-07, published 2014
【 Abstract 】
Background
Whole-genome sequencing of bisulfite-converted DNA ('methylC-seq') provides comprehensive information on DNA methylation. An important application of these whole-genome methylation maps is classifying each position as methylated or non-methylated. A widely used method for this purpose, the so-called binomial method, is intuitive and straightforward, but it lacks power when the sequence coverage and the genome-wide methylation level are low. These problems present a particular challenge when analyzing sparsely methylated genomes, such as those of many invertebrates and plants.
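As a rough illustration of the binomial method mentioned above, the sketch below tests whether the number of methylated reads at a site exceeds what a background error rate alone would produce. The function names, the error rate `p_err` (standing in for bisulfite non-conversion plus sequencing error), and the threshold `alpha` are assumptions chosen for illustration and are not taken from the paper; multiple-testing correction, which is normally applied genome-wide, is omitted.

```python
from math import comb


def binomial_tail(k, n, p_err):
    """Return P(X >= k) for X ~ Binomial(n, p_err).

    k: reads supporting methylation at the site
    n: total reads covering the site
    p_err: assumed background error rate (hypothetical value)
    """
    return sum(
        comb(n, i) * p_err**i * (1 - p_err) ** (n - i)
        for i in range(k, n + 1)
    )


def call_methylated(k, n, p_err=0.01, alpha=0.05):
    """Call a site methylated when the binomial tail p-value is below alpha.

    A genome-wide analysis would adjust the p-values for multiple
    testing first; that step is left out of this sketch.
    """
    return binomial_tail(k, n, p_err) < alpha


if __name__ == "__main__":
    # One methylated read out of ten is compatible with background error.
    print(call_methylated(1, 10))
    # Five methylated reads out of ten are not.
    print(call_methylated(5, 10))
```

Under these assumptions the smallest p-value attainable at coverage `n` is `p_err**n`, so a site covered by only one or two reads can never yield strong evidence, which is one way to see why the test loses power at low coverage.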
Results
We demonstrate that the number of sequence reads per position in methylC-seq data displays a large variance and can be modeled as a shifted negative binomial distribution. We also show that DNA methylation levels of adjacent CpG sites are correlated, and that this similarity in local DNA methylation levels extends over several kilobases. Taking these observations into account, we propose a new method based on Bayesian classification that infers the DNA methylation status of a specific site while considering the DNA methylation levels of its neighborhood. Using computational simulations, area under the curve (AUC) analyses, and comparisons of consistency across biological replicates, we show that our approach has higher sensitivity and better classification performance than the binomial method. This method is especially advantageous in the analysis of sparsely methylated genomes with low coverage.
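The idea of combining read evidence with neighborhood methylation can be sketched as a posterior odds calculation. This is not the Bis-Class model itself, whose details are not given in the abstract. It is a minimal illustration that assumes a binomial likelihood for each class, a hypothetical methylated-read rate `p_meth` for truly methylated sites, a hypothetical error rate `p_err` for unmethylated sites, and a prior `local_prior` taken to be the fraction of methylated sites in the surrounding window.

```python
def posterior_odds(k, n, local_prior, p_meth=0.8, p_err=0.01):
    """Posterior odds that a site is methylated versus unmethylated.

    k: reads supporting methylation, n: total reads at the site.
    local_prior: assumed prior probability of methylation, e.g. the
        fraction of methylated neighbors (hypothetical input).
    p_meth, p_err: assumed per-read probabilities of observing a
        methylated read in each class (illustrative values).
    """
    eps = 1e-6
    # Keep the prior strictly inside (0, 1) so the odds stay finite.
    prior = min(max(local_prior, eps), 1 - eps)
    # Binomial coefficients cancel in the likelihood ratio.
    lik_ratio = (p_meth / p_err) ** k * ((1 - p_meth) / (1 - p_err)) ** (n - k)
    return prior / (1 - prior) * lik_ratio


def classify(k, n, local_prior, threshold=1.0):
    """Call the site methylated when the posterior odds exceed threshold."""
    return posterior_odds(k, n, local_prior) > threshold


if __name__ == "__main__":
    # Same reads (1 of 2 methylated), different neighborhoods.
    print(classify(1, 2, local_prior=0.5))
    print(classify(1, 2, local_prior=0.01))
```

With identical read counts, the call flips depending on the neighborhood prior, which is the behavior that makes local information useful when coverage is too low for the reads alone to be decisive.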
Conclusions
Our method improves on the existing binomial method for binary methylation calls by using a posterior odds framework and incorporating local methylation information. This method should be widely applicable to the analysis of methylC-seq data from diverse sparsely methylated genomes. Bis-Class and example data are provided at a dedicated website (http://bibs.snu.ac.kr/software/Bisclass).
【 License 】
2014 Huh et al.; licensee BioMed Central Ltd.