Journal article details
BMC Bioinformatics
Genetic sequence-based prediction of long-range chromatin interactions suggests a potential role of short tandem repeat sequences in genome organization
Research Article
Sarvesh Nikumbh1  Nico Pfeifer2 
[1] Computational Biology & Applied Algorithmics, Max Planck Institute for Informatics, Saarland Informatics Campus, Building E1.4, D-66123, Saarbruecken, Germany; Present address: Department of Computer Science, University of Tübingen, Sand 14, D-72076, Tübingen, Germany
Keywords: Long-range interactions prediction; Support vector machines; Multitask learning; Hi-C; Visualizations
DOI  :  10.1186/s12859-017-1624-x
Received 2016-10-18, accepted 2017-04-05, published 2017
Source: Springer
【 Abstract 】

Background: Knowing the three-dimensional (3D) structure of the chromatin is important for obtaining a complete picture of the regulatory landscape. Changes in the 3D structure have been implicated in diseases. While there are approaches that attempt to predict long-range chromatin interactions, they focus only on interactions between specific genomic regions (promoters and enhancers), neglecting other possibilities, for instance the so-called structural interactions involving intervening chromatin.

Results: We present a method that can be trained on 5C data, using the genetic sequence of the candidate loci, to predict potential genome-wide interaction partners of a particular locus of interest. We have built locus-specific support vector machine (SVM)-based predictors using the oligomer distance histograms (ODH) representation. The method shows good performance, with a mean test AUC (area under the receiver operating characteristic (ROC) curve) of 0.7 or higher for various regions across the cell lines GM12878, K562 and HeLa-S3. Where a locus did not have sufficient candidate interaction partners for model training, we employed multitask learning to share knowledge between models of different loci. In this scenario, across the three cell lines, the method attained an average performance increase of 0.09 in the AUC. When models trained on 5C data were evaluated on an independent high-resolution Hi-C dataset (a rather hard problem), they achieved an AUC of 0.56 on average. Additionally, we have developed new, intuitive visualization methods that enable interpretation of the sequence signals that contributed towards prediction of locus-specific interaction partners. The analysis of these sequence signals suggests a potential general role of short tandem repeat sequences in genome organization.

Conclusions: We demonstrated how our approach can 1) provide insights into sequence features of locus-specific interaction partners, and 2) identify their cell-line specificity. That our models deem short tandem repeat sequences discriminative for prediction of potential interaction partners suggests that they could play a larger role in genome organization. Thus, our approach can (a) help to broadly understand, at the sequence level, chromatin interactions and higher-order structures like (meta-) topologically associating domains (TADs); (b) help study regions omitted from existing prediction approaches using various information sources (e.g., epigenetic information); and (c) improve methods that predict the 3D structure of the chromatin.
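
Illustrative note: the abstract names the oligomer distance histograms (ODH) representation but does not spell it out. The sketch below shows one common reading of ODH: for every ordered pair of k-mers and every distance d, count how often the first k-mer starts at position i and the second at position i + d. The function name `odh_features`, the parameter values (`k`, `max_dist`), and the inclusion of distance 0 are assumptions made for illustration. This is not the authors' implementation, and the abstract does not state the settings used in the paper.

```python
from itertools import product


def odh_features(seq, k=1, max_dist=3, alphabet="ACGT"):
    """Return a flat ODH-style count vector for one sequence.

    Layout (an illustrative choice): the entry for k-mer pair (a, b) at
    distance d is stored at index (a * m + b) * (max_dist + 1) + d,
    where m is the number of possible k-mers.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: j for j, kmer in enumerate(kmers)}
    m = len(kmers)
    counts = [0] * (m * m * (max_dist + 1))

    # k-mer id at each start position; None for k-mers containing
    # characters outside the alphabet (e.g. 'N').
    starts = [index.get(seq[i:i + k]) for i in range(len(seq) - k + 1)]

    for i, a in enumerate(starts):
        if a is None:
            continue
        for d in range(max_dist + 1):
            j = i + d
            if j >= len(starts):
                break
            b = starts[j]
            if b is None:
                continue
            counts[(a * m + b) * (max_dist + 1) + d] += 1
    return counts


if __name__ == "__main__":
    feats = odh_features("ACAC", k=1, max_dist=2)
    print(len(feats), sum(feats))
```

A vector of this kind, computed per candidate locus, is the sort of fixed-length input an SVM can be trained on. The feature length grows as 4^(2k) × (max_dist + 1), so larger k quickly becomes expensive.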

【 License 】

CC BY   
© The Author(s) 2017
