期刊论文详细信息
Algorithms for Molecular Biology
Sampling solution traces for the problem of sorting permutations by signed reversals
Marie-France Sagot2  Zanoni Dias1  Christian Baudet2 
[1]Institute of Computing, University of Campinas, Campinas - SP, Brazil
[2]INRIA Grenoble-Rhône-Alpes, team BAMBOO, 655 avenue de l’Europe, 38334 Montbonnot Cedex, France
关键词: Genome rearrangement;    Sampling;    Traces;    Reversals;   
Others  :  795021
DOI  :  10.1186/1748-7188-7-18
 received in 2012-02-08, accepted in 2012-06-01,  发布年份 2012
PDF
【 摘 要 】

Background

Traditional algorithms to solve the problem of sorting by signed reversals output just one optimal solution while the space of all optimal solutions can be huge. A so-called trace represents a group of solutions which share the same set of reversals that must be applied to sort the original permutation following a partial ordering. By using traces, we therefore can represent the set of optimal solutions in a more compact way. Algorithms for enumerating the complete set of traces of solutions were developed. However, due to their exponential complexity, their practical use is limited to small permutations. A partial enumeration of traces is a sampling of the complete set of traces and can be an alternative for the study of distinct evolutionary scenarios of big permutations. Ideally, the sampling should be done uniformly from the space of all optimal solutions. This is however conjectured to be P-complete.

Results

We propose and evaluate three algorithms for producing a sampling of the complete set of traces that instead can be shown in practice to preserve some of the characteristics of the space of all solutions. The first algorithm (RA) performs the construction of traces through a random selection of reversals on the list of optimal 1-sequences. The second algorithm (DFALT) consists in a slight modification of an algorithm that performs the complete enumeration of traces. Finally, the third algorithm (SWA) is based on a sliding window strategy to improve the enumeration of traces. All proposed algorithms were able to enumerate traces for permutations with up to 200 elements.

Conclusions

We analysed the distribution of the enumerated traces with respect to their height and average reversal length. Various works indicate that the reversal length can be an important aspect in genome rearrangements. The algorithms RA and SWA show a tendency to lose traces with high average reversal length. Such traces are however rare, and qualitatively our results show that, for testable-sized permutations, the algorithms DFALT and SWA produce distributions which approximate the reversal length distributions observed with a complete enumeration of the set of traces.

【 授权许可】

   
2012 Baudet et al.; licensee BioMed Central Ltd.

【 预 览 】
附件列表
Files Size Format View
20140705080235842.pdf 2930KB PDF download
Figure 10. 89KB Image download
Figure 9. 107KB Image download
Figure 8. 55KB Image download
Figure 7. 55KB Image download
Figure 6. 148KB Image download
Figure 5. 142KB Image download
Figure 4. 27KB Image download
Figure 3. 24KB Image download
Figure 2. 106KB Image download
Figure 1. 51KB Image download
【 图 表 】

Figure 1.

Figure 2.

Figure 3.

Figure 4.

Figure 5.

Figure 6.

Figure 7.

Figure 8.

Figure 9.

Figure 10.

【 参考文献 】
  • [1]Hannenhalli S, Pevzner PA: Transforming Men into Mice (Polynomial Algorithm for Genomic Distance Problem). FOCS IEEE Computer Society 1995, 581-592.
  • [2]Hannenhalli S, Pevzner PA: Transforming Cabbage into Turnip: Polynomial Algorithm for Sorting Signed Permutations by Reversals. Journal of the ACM 1999, 46:1-27.
  • [3]Bergeron A: A Very Elementary Presentation of the Hannenhalli-Pevzner Theory. In Proceedings of the 12th Annual Symposium of the Combinatorial Pattern Matching (CPM’2001), Volume 2089 of Lecture Notes in Computer Science. Jerusalem, Israel; 2001:106-117.
  • [4]Tannier E, Bergeron A, Sagot MF: Advances on sorting by reversals. Discrete Applied Mathematics 2007, 155:881-888.
  • [5]Bader DA, Moret BME, Yan M: A Linear-Time Algorithm for Computing Inversion Distance Between Signed Permutations with an Experimental Study. Journal of Computational Biology 2001, 8(5):483-491.
  • [6]Swenson KM, Rajan V, Lin Y, Moret BME: Sorting Signed Permutations by Inversions in O(n log n) Time. Journal of Computational Biology 2010, 17(3):489-501.
  • [7]Yancopoulos S, Attie O, Friedberg R: Efficient sorting of genomic permutations by translocation, inversion and block interchange. Bioinformatics 2005, 21(16):3340-3346.
  • [8]Bergeron A, Mixtacki J, Stoye J: A new linear time algorithm to compute the genomic distance via the double cut and join distance. Theoretical Computer Science 2009, 410:5300-5316.
  • [9]Braga MDV, Stoye J: The Solution Space of Sorting by DCJ. Journal of Computational Biology 2010, 17(9):1145-1165.
  • [10]Miklós I, Tannier E: Bayesian sampling of genomic rearrangement scenarios via double cut and join. Bioinformatics 2010, 26(24):3012-3019.
  • [11]Siepel AC: An Algorithm to Enumerate Sorting Reversals. Journal of Computational Biology 2003, 10(3-4):575-597.
  • [12]Swenson KM, Badr G, Sankoff D: Listing all sorting reversals in quadratic time. Algorithms for Molecular Biology 2011, 6:11. BioMed Central Full Text
  • [13]York TL, Durrett R, Nielsen R: Bayesian Estimation of the Number of Inversions in the History of Two Chromosomes. Journal of Computational Biology 2002, 9(6):805-818.
  • [14]Durrett R, Nielsen R, York TL: Bayesian Estimation of Genomic Distance. Genetics 2004, 166:621-629.
  • [15]Miklós I: MCMC genome rearrangement. Bioinformatics 2003, 19(Suppl. 2):ii130-ii137.
  • [16]Larget B, Simon DL, Kadane JB, Sweet D: A Bayesian Analysis of Metazoa Mitochondrial Genome Arrangements. Molecular Biology and Evolution 2005, 22(3):486-495.
  • [17]Larget B, Kadane JB, Simon DL: A Bayesian approach to the estimation of ancestral genome arrangements. Molecular Phylogenetics and Evolution 2005, 36:214-223.
  • [18]Miklós I, Darling AE: Efficient Sampling of Parsimonious Inversion Histories with Application to Genome Rearrangement in Yersinia. Genome Biology and Evolution 2009, 1:153-164.
  • [19]Bergeron A, Chauve C, Hartman T, Saint-Onge K: On the Properties of Sequences of Reversals that Sort a Signed Permutation. In Proceedings of the JOBIM 2002. Saint Malo; 2002:99-108.
  • [20]Cartier P, Foata D: Problèmes combinatoires de commutation et réarrangements. No. 85 in Lecture Notes in Mathematics. Berlin: Springer-Verlag; 1969.
  • [21]Braga MDV, Sagot MF, Scornavacca C, Tannier E: Exploring the solution space of sorting by reversals with experiments and an application to evolution. Transactions on Computational Biology and Bioinformatics 2008, 5(3):348-356.
  • [22]Braga MDV, Gautier C, Sagot MF: An asymmetric approach to preserve common intervals while sorting by reversals. Algorithms for Molecular Biology 2009, 4:16. BioMed Central Full Text
  • [23]Braga MDV: Exploring the Solution Space of Sorting by Reversals When Analyzing Genome Rearrangements. France: Université Lyon 1; 2008.
  • [24]Baudet C, Dias Z: An Improved Algorithm to Enumerate All Traces that Sort a Signed Permutation by Reversals. In Proceedings of the 25th Symposium On Applied Computing (ACM SAC 2010). Sierre, Switzerland 2010: [5 pages, Bioinformatics Track];
  • [25]Badr G, Swenson K, Sankoff D: Listing All Parsimonious Reversal Sequences: New Algorithms and Perspectives. In Proceedings of the 8th Annual RECOMB Satellite Workshop on Comparative Genomics (RECOMB-CG 2010), Volume 6398 of Lecture Notes in Bioinformatics. Edited by Tannier E. Ottawa, Canada: Springer-Verlag Berlin Heidelberg; 2010:39-49.
  • [26]Lefebvre JF, El-Mabrouk N, Tillier E, Sankoff D: Detection and validation of single gene inversions. Bioinformatics 2003, 19:i190-i196.
  • [27]Cáceres M, Barbadilla A, Ruiz A: Recombination Rate Predicts Inversion Size in Diptera. Genetics 1999, 153:251-259.
  • [28]Darling AE, Miklós I, Ragan MA: Dynamics of Genome Rearrangement in Bacterial Populations. PLOS Genetics 2008, 4(7):1-16.
  • [29]Sankoff D, Lefebvre JF, Tillier E, Maler A, El-Mabrouk N: The Distribution of Inversion Lengths in Bacteria. In Proceedings of the 2nd Annual RECOMB Satellite Workshop on Comparative Genomics (RECOMB-CG 2004), Volume 3388 of Lecture Notes in Bioinformatics. Edited by Lagergren J. Bertinoro, Italy: Springer-Verlag Berlin Heidelberg; 2005:97-108.
  • [30]Braga MDV: baobabLuna: the solution space of sorting by reversals. Bioinformatics 2009, 25(14):1833-1835. [Applications Notes]
  • [31]Swenson KM, Lin Y, Rajan V, Moret BM: Hurdles Hardly Have to Be Heeded. In Proceedings of the International Workshop on Comparative Genomics (RECOMB-CG’08) Volume 5267 of Lecture Notes in Computer Science. Paris; 2008:241-251.
  文献评价指标  
  下载次数:193次 浏览次数:37次