Journal Article

【Abstract】

Background: With the development of chromosomal conformation capturing techniques, particularly the Hi-C technique, the study of the spatial conformation of a genome is becoming an important topic in bioinformatics and computational biology. The Hi-C technique can generate genome-wide chromosomal interaction (contact) data, which can be used to investigate the higher-level organization of chromosomes, such as Topologically Associated Domains (TADs), i.e., locally packed chromosome regions bound together by intra-chromosomal contacts. The identification of the TADs of a genome is useful for studying gene regulation, genomic interaction, and genome function.

Results: Here, we formulate the TAD identification problem as an unsupervised machine learning (clustering) problem and develop a new TAD identification method called ClusterTAD. We introduce a novel method to represent chromosomal contacts as features to be used by the clustering algorithm. Our results show that ClusterTAD can accurately predict the TADs on simulated Hi-C data. Our method is also largely complementary to and consistent with existing methods on the real Hi-C datasets of two mouse cells. Validation with chromatin immunoprecipitation sequencing (ChIP-Seq) data shows that the domain boundaries identified by ClusterTAD have a high enrichment of CTCF binding sites, promoter-related marks, and enhancer-related histone modifications.

Conclusions: As ClusterTAD is based on a proven clustering approach, it opens a new avenue to apply a large array of clustering methods developed in the machine learning field to the TAD identification problem. The source code, the results, and the TADs generated for the simulated and real Hi-C datasets are available here: https://github.com/BDM-Lab/ClusterTAD.

【License】

CC BY
© The Author(s). 2017

【Preview】

Attachments
Files	Size	Format	View
RO202311102349646ZK.pdf	3773KB	PDF	download
MediaObjects/13395_2023_326_MOESM1_ESM.docx	300KB	Other	download
12951_2015_155_Article_IEq39.gif	1KB	Image	download

【Figures and Tables】

12951_2015_155_Article_IEq39.gif


BMC Bioinformatics
ClusterTAD: an unsupervised machine learning approach to detecting topologically associated domains of chromosomes from Hi-C data
Research Article
Oluwatosin Oluwadare¹ Jianlin Cheng²
[1] Electrical Engineering and Computer Science Department, University of Missouri, 65211, Columbia, MO, USA; Informatics Institute, University of Missouri, 65211, Columbia, MO, USA
Keywords: Clustering; Hi-C; Topologically associated domain (TAD); CTCF; Chromosome conformation capturing; Genome structure; Chromosome organization
DOI : 10.1186/s12859-017-1931-2
Received 2017-07-14, accepted 2017-11-06, published 2017
Source: Springer

