Journal article details
BMC Bioinformatics
HOME: a histogram based machine learning approach for effective identification of differentially methylated regions
Steven R. Eichten1  Justin O. Borevitz1  Akanksha Srivastava2  Yuliya V. Karpievitch3  Ryan Lister3 
[1] 0000 0001 2180 7477, grid.1001.0, ARC Centre of Excellence in Plant Energy Biology, The Australian National University, Canberra, Australia;0000 0004 1936 7910, grid.1012.2, ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, Australia;0000 0004 1936 7910, grid.1012.2, ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, Australia;0000 0004 0469 0045, grid.431595.f, Harry Perkins Institute of Medical Research, Perth, Australia;
Keywords: Whole genome bisulfite sequencing; DNA methylation; Epigenetics; DMR identification; SVM
DOI: 10.1186/s12859-019-2845-y
Source: publisher
【 Abstract 】

Background: The development of whole genome bisulfite sequencing has made it possible to identify methylation differences at single base resolution throughout an entire genome. However, a persistent challenge in DNA methylome analysis is the accurate identification of differentially methylated regions (DMRs) between samples. Sensitive and specific identification of DMRs among different conditions requires accurate and efficient algorithms, and while various tools have been developed to tackle this problem, they frequently suffer from inaccurate DMR boundary identification and high false positive rate.

Results: We present a novel Histogram Of MEthylation (HOME) based method that takes into account the inherent difference in the distribution of methylation levels between DMRs and non-DMRs to discriminate between the two using a Support Vector Machine. We show that generated features used by HOME are dataset-independent such that a classifier trained on, for example, a mouse methylome training set of regions of differentially accessible chromatin, can be applied to any other organism’s dataset and identify accurate DMRs. We demonstrate that DMRs identified by HOME exhibit higher association with biologically relevant genes, processes, and regulatory events compared to the existing methods. Moreover, HOME provides additional functionalities lacking in most of the current DMR finders such as DMR identification in non-CG context and time series analysis. HOME is freely available at https://github.com/ListerLab/HOME.

Conclusion: HOME produces more accurate DMRs than the current state-of-the-art methods on both simulated and biological datasets. The broad applicability of HOME to identify accurate DMRs in genomic data from any organism will have a significant impact upon expanding our knowledge of how DNA methylation dynamics affect cell development and differentiation.
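
The histogram idea in the abstract can be illustrated with a small sketch. This is not HOME's actual feature definition, which the abstract does not specify. The function name `methylation_difference_histogram`, the choice of absolute per-cytosine differences, the ten equal-width bins, and the example values are all assumptions made for illustration. The sketch only shows how a window of per-cytosine methylation levels from two samples might be summarized as a fixed-length vector, which a classifier such as an SVM could then take as input.

```python
from typing import List, Sequence


def methylation_difference_histogram(
    levels_a: Sequence[float],
    levels_b: Sequence[float],
    n_bins: int = 10,
) -> List[float]:
    """Return a normalized histogram of |levels_a - levels_b| over [0, 1].

    Hypothetical helper: each input holds per-cytosine methylation levels
    (fraction of methylated reads, between 0 and 1) for the same window.
    """
    if len(levels_a) != len(levels_b):
        raise ValueError("both samples must cover the same cytosines")
    if not levels_a:
        raise ValueError("window must contain at least one cytosine")

    counts = [0] * n_bins
    for a, b in zip(levels_a, levels_b):
        diff = abs(a - b)
        # Clamp so that a difference of exactly 1.0 falls in the last bin.
        idx = min(int(diff * n_bins), n_bins - 1)
        counts[idx] += 1

    total = len(levels_a)
    return [c / total for c in counts]


# Made-up example windows (not real data).
sample_a = [0.90, 0.85, 0.95, 0.80]
dmr_like = methylation_difference_histogram(sample_a, [0.15, 0.10, 0.20, 0.05])
background = methylation_difference_histogram(sample_a, [0.87, 0.88, 0.91, 0.83])
```

In this toy case the DMR-like window concentrates its mass in a high-difference bin while the background window concentrates it in the lowest bin, which is the kind of distributional contrast a classifier could learn to separate. Because the vector length depends only on `n_bins` and not on the window size or genome, features of this general shape are plausibly transferable across datasets, though the abstract's dataset-independence claim refers to HOME's own features.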

【 License 】

CC BY   
