Journal of Computational Biology: A Journal of Computational Molecular Cell Biology
Sequential Integration of Fuzzy Clustering and Expectation Maximization for Transcription Factor Binding Site Identification
Jinwook Choi^2,31  Ali Yousefian-Jazi^13
[1] Address correspondence to: Dr. Jinwook Choi, Department of Biomedical Engineering, College of Medicine, Seoul National University, 103 Daehak-ro, Jongno-gu, Seoul 110-799, Korea^2; Department of Biomedical Engineering, College of Medicine, Seoul National University, Seoul, Korea^3; Interdisciplinary Program in Bioengineering, Graduate School, Seoul National University, Seoul, Korea^1
Keywords: chromatin immunoprecipitation sequencing; expectation maximization; fuzzy C-means; motif discovery
DOI: 10.1089/cmb.2017.0230
Subject classification: Biological sciences (general)
Source: Mary Ann Liebert, Inc. Publishers
【Abstract】
The identification of transcription factor binding sites (TFBSs) is a problem for which computational methods offer great hope. Thus far, the expectation maximization (EM) technique has been successfully used to find TFBSs in DNA sequences, but inappropriate initialization of EM has yielded poor performance or poor running-time scalability on a given data set. In this study, we used a sequential integration approach, which defines the final solution as the set of solutions obtained by solving objectives in a cascaded manner, to integrate the fuzzy C-means and EM approaches to DNA motif discovery. The new method is explained in detail and tested on chromatin immunoprecipitation sequencing (ChIP-seq) data sets for different transcription factors (TFs) with various motif patterns. The proposed algorithm also suggests an efficient process for analyzing motif similarity to known motifs as well as for finding a target motif. A comparison with the well-known motif-finding tool MEME-ChIP shows the advantages of our proposed framework over this existing tool. Experimental results show that we were able to find the true motifs for all TFs, and that the motifs found by our proposed algorithm were more similar to the known JASPAR motifs for the STAT1, GATA1, and JUN TFs than those found by MEME-ChIP.
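The cascade described in the abstract (fuzzy C-means first, then EM started from its result) can be illustrated with a minimal sketch. This is not the authors' implementation, since the abstract does not give the encoding, objective functions, or model details. The sketch assumes one-hot encoded k-mers, two fuzzy clusters, selection of the cluster center with the highest information content as the starting motif, and a simple two-component mixture (motif versus uniform background) for the EM stage. All function names and parameters (`one_hot_kmers`, `fuzzy_c_means`, `em_refine`, `fcm_then_em`, `pseudo`) are hypothetical.

```python
import numpy as np

BASES = "ACGT"

def one_hot_kmers(seqs, w):
    """Encode every length-w window of every sequence as a (w, 4) one-hot matrix."""
    idx = {b: i for i, b in enumerate(BASES)}
    kmers = [s[i:i + w] for s in seqs for i in range(len(s) - w + 1)]
    X = np.zeros((len(kmers), w, 4))
    for n, kmer in enumerate(kmers):
        for j, base in enumerate(kmer):
            X[n, j, idx[base]] = 1.0
    return X

def fuzzy_c_means(X, c=2, m=2.0, iters=50, seed=0):
    """Standard fuzzy C-means on flattened k-mers; returns centers and memberships."""
    rng = np.random.default_rng(seed)
    flat = X.reshape(len(X), -1)
    U = rng.dirichlet(np.ones(c), size=len(flat))  # (n, c), each row sums to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ flat) / Um.sum(axis=0)[:, None]
        # Euclidean distance of every k-mer to every center (small offset avoids 0)
        d = np.linalg.norm(flat[:, None, :] - centers[None], axis=2) + 1e-9
        U = d ** (-2.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

def info_content(pwm):
    """Total information content (bits) of a position weight matrix."""
    p = np.clip(pwm, 1e-9, 1.0)
    return float((2.0 + (p * np.log2(p)).sum(axis=1)).sum())

def em_refine(X, pwm, iters=30, pseudo=0.1):
    """EM for a two-component mixture: motif PWM vs. uniform background."""
    pwm = pwm + pseudo                      # smooth zeros in the FCM center
    pwm = pwm / pwm.sum(axis=1, keepdims=True)
    lam = 0.5                               # assumed prior fraction of motif k-mers
    bg = 0.25 ** X.shape[1]                 # uniform background likelihood
    for _ in range(iters):
        # E-step: posterior probability that each k-mer is a motif occurrence
        p_motif = np.prod((X * pwm).sum(axis=2), axis=1)
        z = lam * p_motif / (lam * p_motif + (1.0 - lam) * bg)
        # M-step: re-estimate the PWM and mixing weight from the posteriors
        counts = np.einsum("n,nwb->wb", z, X) + pseudo
        pwm = counts / counts.sum(axis=1, keepdims=True)
        lam = float(np.clip(z.mean(), 1e-6, 1.0 - 1e-6))
    return pwm, z

def fcm_then_em(seqs, w, c=2):
    """Cascade: the FCM centers supply the starting PWM that EM then refines."""
    X = one_hot_kmers(seqs, w)
    centers, _ = fuzzy_c_means(X, c=c)
    init = max(centers.reshape(c, w, 4), key=info_content)
    return em_refine(X, init)

def consensus(pwm):
    """Most probable base at each motif position."""
    return "".join(BASES[i] for i in pwm.argmax(axis=1))
```

Because each one-hot position sums to one, every fuzzy C-means center is already a valid position weight matrix, which is what allows it to be passed directly to the EM stage as an initial motif. A call such as `pwm, z = fcm_then_em(seqs, w=8)` followed by `consensus(pwm)` would give a candidate motif; comparing it against JASPAR entries, as the abstract describes, is outside the scope of this sketch.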
【License】
Unknown