期刊论文详细信息
Algorithms for Molecular Biology
Unrooted unordered homeomorphic subtree alignment of RNA trees
Michal Ziv-Ukelson1  Yefim Dinitz1  Eitan Bachmat1  Erez Katzenelson1  Shay Zakov2  Nimrod Milo1 
[1]Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, Israel
[2]Department of Computer Science and Engineering, UC San Diego, La Jolla, CA, USA
关键词: Cavity matching;    algorithm;    Unrooted;    Homeomorphic subtree alignment;    Tree alignment;    RNA structure;   
Others  :  793343
DOI  :  10.1186/1748-7188-8-13
 received in 2012-12-20, accepted in 2013-02-05,  发布年份 2013
PDF
【 摘 要 】

We generalize some current approaches for RNA tree alignment, which are traditionally confined to ordered rooted mappings, to also consider unordered unrooted mappings. We define the Homeomorphic Subtree Alignment problem (HSA), and present a new algorithm which applies to several modes, combining global or local, ordered or unordered, and rooted or unrooted tree alignments. Our algorithm generalizes previous algorithms that either solved the problem in an asymmetric manner, or were restricted to the rooted and/or ordered cases. Focusing here on the most general unrooted unordered case, we show that for input trees T and S, our algorithm has an O(nTnS + min(dT,dS)LTLS) time complexity, where nT,LT and dT are the number of nodes, the number of leaves, and the maximum node degree in T, respectively (satisfying dT ≤ LT ≤ nT), and similarly for nS,LS and dS with respect to the tree S. This improves the time complexity of previous algorithms for less general variants of the problem.

In order to obtain this time bound for HSA, we developed new algorithms for a generalized variant of the Min-Cost Bipartite Matching problem (MCM), as well as to two derivatives of this problem, entitled All-Cavity-MCM and All-Pairs-Cavity-MCM. For two input sets of size n and m, where n ≤ m, MCM and both its cavity derivatives are solved in O(n3 + nm) time, without the usage of priority queues (e.g. Fibonacci heaps) or other complex data structures. This gives the first cubic time algorithm for All-Pairs-Cavity-MCM, and improves the running times of MCM and All-Cavity-MCM problems in the unbalanced case where n ≪ m.

We implemented the algorithm (in all modes mentioned above) as a graphical software tool which computes and displays similarities between secondary structures of RNA given as input, and employed it to a preliminary experiment in which we ran all-against-all inter-family pairwise alignments of RNAse P and Hammerhead RNA family members, exposing new similarities which could not be detected by the traditional rooted ordered alignment approaches. The results demonstrate that our approach can be used to expose structural similarity between some RNAs with higher sensitivity than the traditional rooted ordered alignment approaches. Source code and web-interface for our tool can be found in http://www.cs.bgu.ac.il/\~negevcb/FRUUT webcite.

【 授权许可】

   
2013 Milo et al.; licensee BioMed Central Ltd.

【 预 览 】
附件列表
Files Size Format View
20140705050403200.pdf 2197KB PDF download
Figure 14. 33KB Image download
Figure 13. 111KB Image download
Figure 12. 110KB Image download
Figure 11. 91KB Image download
Figure 10. 113KB Image download
Figure 9. 37KB Image download
Figure 8. 49KB Image download
Figure 8. 49KB Image download
Figure 8. 49KB Image download
Figure 8. 49KB Image download
Figure 8. 49KB Image download
Figure 8. 49KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 7. 141KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 6. 66KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 5. 76KB Image download
Figure 4. 51KB Image download
Figure 4. 51KB Image download
Figure 4. 51KB Image download
Figure 4. 51KB Image download
Figure 4. 51KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 3. 43KB Image download
Figure 2. 112KB Image download
Figure 2. 112KB Image download
Figure 1. 115KB Image download
【 图 表 】

Figure 1.

Figure 2.

Figure 2.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 3.

Figure 4.

Figure 4.

Figure 4.

Figure 4.

Figure 4.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 5.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 6.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 7.

Figure 8.

Figure 8.

Figure 8.

Figure 8.

Figure 8.

Figure 8.

Figure 9.

Figure 10.

Figure 11.

Figure 12.

Figure 13.

Figure 14.

【 参考文献 】
  • [1]Agmon I, Auerbach T, Baram D, Bartels H, Bashan A, Berisio R, Fucini P, Hansen H, Harms J, Kessler M, et al.: On peptide bond formation, translocation, nascent protein progression and the regulatory properties of ribosomes. Eur J Biochem 2003, 270(12):2543-2556.
  • [2]Hofacker I: Vienna RNA secondary structure server. Nucleic Acids Res 2003, 31(13):3429.
  • [3]Steffen P, Voss B, Rehmsmeier M, Reeder J, Giegerich R: RNAshapes: anintegrated RNA analysis package based on abstract shapes. Bioinformatics 2006, 22(4):500-503.
  • [4]Höchsmann M, Toller T, Giegerich R, Kurtz S: Local similarity in RNA secondary structures. In Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE. IEEE; 2003:159-168.
  • [5]Jiang T, Lin G, Ma B, Zhang K: A general edit distance between RNA structures. J Comput Biol 2002, 9(2):371-388.
  • [6]Zhang K, Wang L, Ma B: Computing similarity between RNA structures. In Combinatorial Pattern Matching. Springer; 1999:281-293.
  • [7]Bille P: A survey on tree edit distance and related problems. Theor Comput Sci 2005, 337(1-3):217-239.
  • [8]Jiang T, Wang L, Zhang K: Alignment of trees—an alternative to tree edit. Theor Comput Sci 1995, 143:137-148.
  • [9]Zhang K: Computing similarity between RNA secondary structures. In INTSYS ’98: Proceedings of the IEEE International Joint Symposia on Intelligence and Systems. Washington: IEEE Computer Society; 1998:126-126.
  • [10]Le S, Nussinov R, Maizel J: Tree graphs of RNA secondary structures and their comparisons. Comput Biomed Res 1989, 22(5):461-473.
  • [11]Schirmer S, Giegerich R: Forest alignment with affine gaps and anchors. In Combinatorial Pattern Matching. Springer; 2011:104-117.
  • [12]Hofacker I, Fontana W, Stadler P, Bonhoeffer L, Tacker M, Schuster P: Fast folding and comparison of RNA secondary structures. Monatshefte fur Chemie/Chemical Monthly 1994, 125(2):167-188.
  • [13]Liu J, Wang J, Hu J, Tian B: A method for aligning RNA secondary structures and its application to RNA motif detection. BMC Bioinformatics 2005, 6:89. BioMed Central Full Text
  • [14]Blin G, Denise A, Dulucq S, Herrbach C, Touzet H: Alignments of RNA structures. Comput Biol Bioinformatics, IEEE/ACM Trans 2010, 7(2):309-322.
  • [15]Allali J, Sagot M: A multiple graph layers model with application to RNA secondary structures comparison. In String Processing and Information Retrieval. Springer; 2005:348-359.
  • [16]Jan E: Divergent IRES elements in invertebrates. Virus Res 2006, 119:16-28.
  • [17]Perreault J, Weinberg Z, Roth A, Popescu O, Chartrand P, Ferbeyre G, Breaker R: Identification of hammerhead ribozymes in all domains of life reveals novel structural variations. PLoS Comput Biol 2011, 7(5):e1002031.
  • [18]Birikh K, Heaton P, Eckstein F: The structure, function and application of the hammerhead ribozyme. Eur J Biochem 1997, 245:1-16.
  • [19]Haas E, Brown J: Evolutionary variation in bacterial RNase P RNAs. Nucleic Acids Res 1998, 26(18):4093-4099.
  • [20]Zhang K, Jiang T: Some MAX SNP-hard results concerning unordered labeled trees. Inf Process Lett 1994, 49(5):249-254.
  • [21]Matula D: Subtree isomorphism in O(n5/2). Ann Discrete Math 1978, 2:91-106.
  • [22]Shamir R, Tsur D: Faster subtree isomorphism. J Algorithms 1999, 33:267-280.
  • [23]Chung M: O(n2.5) time algorithms for the subgraph homeomorphism problem on trees. J Algorithms 1987, 8:106-112.
  • [24]Reyner S: An analysis of a good algorithm for the subtree problem. SIAM J Comput 1977, 6:730.
  • [25]Valiente G: Constrained tree inclusion. J Discrete Algorithms 2005, 3(2-4):431-447.
  • [26]Pinter RY, Rokhlenko O, Tsur D, Ziv-Ukelson M: Approximate labelled subtree homeomorphism. J Discrete Algorithms 2008, 6(3):480-496.
  • [27]Zhang K: A constrained edit distance between unordered labeled trees. Algorithmica 1996, 15(3):205-222.
  • [28]Kao M, Lam T, Sung W, Ting H: Cavity matchings, label compressions, and unrooted evolutionary trees. SIAM J Comput 2000, 30(2):602-624.
  • [29]Dinic E: On solution of two assignment problems. In Studies in Discrete Optimization. Edited by Fridman A. Nauka. Moscow: Nauka; 1976:333-348.
  • [30]Edmonds J, Karp R: Theoretical improvements in algorithmic efficiency for network flow problems. J ACM (JACM) 1972, 19(2):248-264.
  • [31]Fredman M, Tarjan R: Fibonacci heaps and their uses in improved network optimization algorithms. J ACM (JACM) 1987, 34(3):596-615.
  • [32]Gabow H, Tarjan R: Faster scaling algorithms for network problems. SIAM J Comput 1989, 18:1013.
  • [33]Orlin J, Ahuja R: New scaling algorithms for the assignment and minimum mean cycle problems. Math Program 1992, 54:41-56.
  • [34]Needleman S, Wunsch C, et al.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 1970, 48(3):443-453.
  • [35]Maes M: On a cyclic string-to-string correction problem. Inf Process Lett 1990, 35(2):73-78.
  • [36]Schmidt JP: All highest scoring paths in weighted grid graphs and their application to finding all approximate repeats in strings. SIAM J Comput 1998, 27(4):972-992.
  • [37]Tiskin A: Semi-local string comparison: Algorithmic techniques and applications. Math Comput Sci 2008, 1(4):571-603.
  • [38]Zhang K: Algorithms for the constrained editing distance between ordered labeled trees and related problems. Pattern Recognit 1995, 28(3):463-474.
  • [39]Tarjan R: Data Structures and Network Algorithms, Volume 44. Society for, Industrial Mathematics; 1983.
  • [40]Ahuja R, Magnanti T, Orlin J, Weihe K: Network flows: theory, algorithms and applications. ZOR-Methods Models Oper Res 1995, 41(3):252-254.
  • [41]Blum M, Floyd R, Pratt V, Rivest R, Tarjan R: Time bounds for selection. J Comput Syst Sci 1973, 7(4):448-461.
  • [42]Dijkstra E: A note on two problems in connexion with graphs. Numerische mathematik 1959, 1:269-271.
  • [43]Lawler E: Combinatorial Optimization: Networks and Matroids. New York: Holt,Rinehart and Winston; 1976.
  • [44]Ford Jr L, Fulkerson D, Ziffer A: Flows in networks. Phys Today 1963, 16:54.
  • [45]Shapiro B: An algorithm for comparing multiple RNA secondary structures. Comput Appl Biosci 1986, 4(3):387-393.
  • [46]Waterman M: Secondary structure of single-stranded nucleic acids. Adv Math Suppl Studies 1978, 1:167-212.
  • [47]Fontana W, Konings D, Stadler P, Schuster P: Statistics of RNA secondary structures. Biopolymers 1993, 33(9):1389-1404.
  • [48]Höchsmann M, Voss B, Giegerich R: Pure multiple RNA secondary structure alignments: a progressive profile approach. IEEE Trans Comput Biol Bioinformatics 2004, 1:53-62.
  • [49]Klein R, Eddy S: RSEARCH: finding homologs of single structured RNA sequences. BMC Bioinformatics 2003, 4:44. BioMed Central Full Text
  • [50]Andronescu M, Bereg V, Hoos HH, Condon A: RNA STRAND: the RNA secondary structure and statistical analysis database. BMC Bioinformatics 2008, 9:340. BioMed Central Full Text
  • [51]Massey Jr F: The Kolmogorov-Smirnov test for goodness of fit. J Am Stat Assoc 1951, 46:68-78.
  • [52]Pace NR, Brown JW: Evolutionary perspective on the structure and function of ribonuclease P, a ribozyme. J Bacteriol 1995, 177(8):1919-1928.
  • [53]Brown J: The ribonuclease P database. Nucleic Acids Res 1999, 27:314.
  • [54]Murray J, Terwey D, Maloney L, Karpeisky A, Usman N, Beigelman L, Scott W: The structural basis of hammerhead ribozyme self-cleavage. Cell 1998, 92(5):665-673.
  • [55]Hean J, Weinberg M: The hammerhead ribozyme revisited: new biological insights. In RNA and the Regulation of Gene Expression: A Hidden Layer of Complexity. Edited by Morris KV. Caister Academic, Pr; 2008:1-1.
  • [56]Pley H, Lindes D, DeLuca-Flaherty C, McKay D: Crystals of a hammerhead ribozyme. J Biol Chem 1965, 268(26):6.
  • [57]Scott W, Finch J, Klug A: The crystal structure of an all-RNA hammerhead ribozyme. In Nucleic Acids Symposium Series, Volume 34. IRL PRESS LTD; 1995:214-216.
  文献评价指标  
  下载次数:455次 浏览次数:10次