Journal article details
Biology Direct
RNA motif discovery: a computational overview
Avinash Achar [1], Pål Sætrom [2]
[1] Department of Computer and Information Science, Norwegian University of Science and Technology, Trondheim, Norway
[2] Department of Cancer Research and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
Keywords: Motif discovery; Secondary structure; RNA
Others  :  1229678
DOI  :  10.1186/s13062-015-0090-5
Received 2015-05-16, accepted 2015-10-01, published 2015
【 Abstract 】

Genomic studies have greatly expanded our knowledge of structural non-coding RNAs (ncRNAs). These RNAs fold into characteristic secondary structures and perform specific structure-dependent biological functions. Hence, RNA secondary structure prediction is one of the most well-studied problems in computational RNA biology. Comparative sequence analysis is one of the more reliable RNA structure prediction approaches, as it exploits information from multiple related sequences to infer the consensus secondary structure. This class of methods essentially learns a global secondary structure from the input sequences. In this paper, we consider the more general problem of unearthing common local secondary-structure-based patterns from a set of related sequences. The input sequences could, for example, correspond to the 3′ or 5′ untranslated regions of a set of orthologous genes, and the unearthed local patterns could correspond to regulatory motifs found in these regions. These sequences could also correspond to in vitro selected RNA, genomic segments housing ncRNA genes from the same family, and so on. Here, we give a detailed review of the various computational techniques proposed in the literature to solve this general motif discovery problem. We also give empirical comparisons of some of the current state-of-the-art methods and point out future directions of research.

【 License 】

2015 Achar and Sætrom.
