| BMC Genomics | |
| MicroRNA target prediction using thermodynamic and sequence curves | |
| Research Article | |
| Somali Chaterji [1], Ananth Grama [1], Raghavendran Shankar [1], Asish Ghoshal [1], Saurabh Bagchi [2] | |
| [1] Department of Computer Science, Purdue University, 47907, West Lafayette, IN, USA; [2] School of Electrical and Computer Engineering, Purdue University, 47907, West Lafayette, IN, USA | |
| Keywords: Target Site; miRNA Target; miRNA Target Prediction; Seed Match; Mouse Dataset | |
| DOI : 10.1186/s12864-015-1933-2 | |
| Received 2015-04-27, accepted 2015-09-09, published 2015 | |
| Source: Springer | |
【 Abstract 】
Background: MicroRNAs (miRNAs) are small regulatory RNAs that mediate RNA interference by binding to various mRNA target regions. Several computational methods exist for identifying the target mRNAs of miRNAs. However, these methods represent all contributory features as scalars, primarily thermodynamic or sequence-based features. Further, a majority of them target only canonical sites, that is, sites with "seed" complementarity. Here, we present a machine-learning classification scheme, titled Avishkar, which captures the spatial profile of miRNA-mRNA interactions via smooth B-spline curves, fitted separately for each class of input features, such as thermodynamic and sequence features. Further, we use a principled approach to uniformly model canonical and non-canonical seed matches, using a novel seed enrichment metric.

Results: We demonstrate that a large number of seed-match patterns have high enrichment values that are conserved across species, and that the majority of miRNA binding sites involve non-canonical matches, corroborating recent findings. Using spatial curves and popular categorical features, such as target site length and location, we train a linear SVM model on experimental CLIP-seq data. Our model significantly outperforms all established methods, for both canonical and non-canonical sites. We achieve this while using a much larger candidate miRNA-mRNA interaction set than prior work.

Conclusions: We have developed an efficient SVM-based model for miRNA target prediction using recent CLIP-seq data. Evaluated using ROC curves, it performs about 20% better than the state of the art across species (human or mouse) and target types (canonical or non-canonical). To the best of our knowledge, we provide the first distributed framework for microRNA target prediction based on Apache Hadoop and Spark.

Availability: All source code and data are publicly available at https://bitbucket.org/cellsandmachines/avishkar.
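The idea of replacing a scalar feature with a smooth spatial curve can be illustrated with a minimal sketch. It is not the Avishkar implementation: the abstract does not specify the knot placement or fitting procedure, so the sketch assumes the simplest case, in which per-position feature values are used directly as control points of a uniform cubic B-spline. The function names and the sample values are hypothetical.

```python
def cubic_bspline_point(p0, p1, p2, p3, t):
    """Evaluate one uniform cubic B-spline segment at t in [0, 1]."""
    # Standard uniform cubic B-spline basis functions; they sum to 1 for any t.
    b0 = (1 - t) ** 3 / 6.0
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6.0
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0
    b3 = t**3 / 6.0
    return b0 * p0 + b1 * p1 + b2 * p2 + b3 * p3


def smooth_profile(values, samples_per_segment=4):
    """Sample a smooth curve from per-position feature values.

    Assumption: the values act as B-spline control points, so the curve
    approximates the profile rather than interpolating it.
    """
    if len(values) < 4:
        raise ValueError("need at least 4 values for a cubic segment")
    curve = []
    for i in range(len(values) - 3):
        p0, p1, p2, p3 = values[i:i + 4]
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            curve.append(cubic_bspline_point(p0, p1, p2, p3, t))
    # Close the curve with the end point of the last segment (t = 1).
    curve.append(cubic_bspline_point(*values[-4:], 1.0))
    return curve


if __name__ == "__main__":
    # Hypothetical per-position values of one feature along a target site.
    profile = [-0.2, -1.5, -2.1, -0.9, -0.3, -1.8, -2.4]
    for y in smooth_profile(profile):
        print(round(y, 3))
```

The sampled curve values could then be concatenated with categorical features to form the input vector of a linear classifier; how the paper actually encodes the curves for its SVM is not stated in this abstract.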
【 License 】
CC BY
© Ghoshal et al. 2016