Journal article details
BMC Bioinformatics
A fast and robust iterative algorithm for prediction of RNA pseudoknotted secondary structures
Hosna Jabbari [1], Anne Condon [1]
[1] Department of Computer Science, University of British Columbia, 2366 Main Mall, Vancouver, Canada
Keywords: Minimum free energy; Hierarchical folding; Pseudoknot; Secondary structure prediction; RNA
DOI: 10.1186/1471-2105-15-147
Received 2014-01-07, accepted 2014-05-08, published 2014
【 Abstract 】

Background

Improving accuracy and efficiency of computational methods that predict pseudoknotted RNA secondary structures is an ongoing challenge. Existing methods based on free energy minimization tend to be very slow and are limited in the types of pseudoknots that they can predict. Incorporating known structural information can improve prediction accuracy; however, there are not many methods for prediction of pseudoknotted structures that can incorporate structural information as input. There is even less understanding of the relative robustness of these methods with respect to partial information.

Results

We present a new method, Iterative HFold, for pseudoknotted RNA secondary structure prediction. Iterative HFold takes as input a pseudoknot-free structure, and produces a possibly pseudoknotted structure whose energy is at least as low as that of any (density-2) pseudoknotted structure containing the input structure. Iterative HFold leverages strengths of earlier methods, namely the fast running time of HFold, a method that is based on the hierarchical folding hypothesis, and the energy parameters of HotKnots V2.0.
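
The input/output relationship described above can be illustrated with a minimal sketch. It is not the Iterative HFold implementation: the function names `parse_pairs`, `is_pseudoknot_free`, and `contains_structure` are hypothetical, and the sketch assumes structures written in dot-bracket notation with `()` and `[]` as the only bracket types. It checks only two properties, namely that an input structure has no crossing base pairs and that an output structure keeps every input base pair.

```python
def parse_pairs(dot_bracket):
    """Parse a dot-bracket string using () and [] into a set of (i, j) pairs."""
    stacks = {"(": [], "[": []}
    closers = {")": "(", "]": "["}
    pairs = set()
    for idx, ch in enumerate(dot_bracket):
        if ch in stacks:
            stacks[ch].append(idx)
        elif ch in closers:
            # Match each closing bracket with the latest unmatched opener of its type.
            pairs.add((stacks[closers[ch]].pop(), idx))
    return pairs


def is_pseudoknot_free(pairs):
    """Return True if no two base pairs cross, i.e. no i < k < j < l."""
    ordered = sorted(pairs)
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return False
    return True


def contains_structure(output_pairs, input_pairs):
    """Return True if every input base pair is also present in the output."""
    return input_pairs <= output_pairs


# Hypothetical example: a pseudoknot-free input and a pseudoknotted output.
input_pairs = parse_pairs("((((......))))..........")
output_pairs = parse_pairs("((((..[[[.))))......]]].")
```

In this example the input consists of a single stem, and the output adds a second stem (written with square brackets) that crosses it, so the output is pseudoknotted while still containing the input.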

Our experimental evaluation on a large data set shows that Iterative HFold is robust with respect to partial information, with average accuracy on pseudoknotted structures steadily increasing from roughly 54% to 79% as the user provides up to 40% of the input structure.
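
A robustness experiment of this kind can be sketched roughly as follows. The abstract does not state how partial information was sampled or how accuracy was scored, so both choices here are assumptions: `sample_partial` (a hypothetical helper) draws a uniformly random fraction of the reference base pairs, and `f_measure` (also hypothetical) scores a prediction by the harmonic mean of base-pair precision and recall.

```python
import random


def sample_partial(pairs, fraction, seed=0):
    """Return a random subset holding about `fraction` of the base pairs.

    Uniform sampling is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    ordered = sorted(pairs)
    k = round(fraction * len(ordered))
    return set(rng.sample(ordered, k))


def f_measure(predicted, reference):
    """Harmonic mean of base-pair precision and recall (assumed accuracy measure)."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference structure with five base pairs.
reference = {(0, 13), (1, 12), (2, 11), (3, 10), (6, 22)}
partial = sample_partial(reference, 0.4)
```

The sampled subset would then be given to the prediction method as its input structure, and the resulting prediction scored against the full reference with `f_measure`.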

Iterative HFold is much faster than HotKnots V2.0, while having comparable accuracy. Iterative HFold also has significantly better accuracy than IPknot on our HK-PK and IP-pk168 data sets.

Conclusions

Iterative HFold is a robust method for prediction of pseudoknotted RNA secondary structures, whose accuracy with more than 5% information about true pseudoknot-free structures is better than that of IPknot, and with about 35% information about true pseudoknot-free structures compares well with that of HotKnots V2.0 while being significantly faster. Iterative HFold and all data used in this work are freely available at http://www.cs.ubc.ca/~hjabbari/software.php.

【 License 】

   
2014 Jabbari and Condon; licensee BioMed Central Ltd.
