Journal Article Details
BMC Genomics
Phylogeny analysis from gene-order data with massive duplications
Research
Bing Feng1  Lingxi Zhou1  Yu Lin2  Jijun Tang3  Jieyi Zhao4 
[1] Department of Computer Science and Engineering, University of South Carolina, 29208, Columbia, South Carolina, USA;Research School of Computer Science, Australian National University, 2601, Canberra, ACT, Australia;School of Computer Science and Engineering, Tianjin University, 300072, Tianjin, China;Department of Computer Science and Engineering, University of South Carolina, 29208, Columbia, South Carolina, USA;University of Texas School of Biomedical Informatics at Houston, 77030, Houston, Texas, USA;
Keywords: Phylogeny reconstruction; Maximum likelihood; Variable length binary encoding; Whole genome duplication
DOI: 10.1186/s12864-017-4129-0
Source: Springer
【 Abstract 】

Background: Gene order changes, under rearrangements, insertions, deletions and duplications, have been used as a new type of data source for phylogenetic reconstruction. Because these changes are rare compared to sequence mutations, they allow the inference of phylogeny further back in evolutionary time. There exist many computational methods for the reconstruction of gene-order phylogenies, including widely used maximum parsimony methods and maximum likelihood methods. However, both kinds of methods face challenges in handling large genomes with many duplicated genes, especially in the presence of whole genome duplication.

Methods: In this paper, we present three simple yet powerful methods based on maximum-likelihood (ML) approaches that encode multiplicities of both gene adjacency and gene content information for phylogenetic reconstruction.

Results: Extensive experiments on simulated data sets show that our new method achieves the most accurate phylogenies compared to existing approaches. We also evaluate our method on real whole-genome data from eleven mammals. The package is publicly accessible at http://www.geneorder.org.

Conclusions: Our new encoding schemes successfully incorporate the multiplicity information of gene adjacencies and gene content into an ML framework, and show promising results in reconstructing phylogenies for whole-genome data in the presence of massive duplications.
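To make the idea of encoding multiplicities concrete, the following is a minimal Python sketch of one possible variable-length binary encoding. It is an illustration under stated assumptions, not the authors' implementation: the function names (`adjacency_counts`, `content_counts`, `encode`, `encode_genomes`), the representation of genomes as lists of linear chromosomes of signed integers, the omission of telomeres, and the choice to write a multiplicity k as k ones padded with zeros are all hypothetical choices made for this example.

```python
from collections import Counter


def _ends(gene):
    """Return (left end, right end) of a signed gene as (id, 't'/'h') pairs."""
    g = abs(gene)
    return ((g, "t"), (g, "h")) if gene > 0 else ((g, "h"), (g, "t"))


def adjacency_counts(genome):
    """Count how often each adjacency occurs (telomeres ignored in this sketch)."""
    counts = Counter()
    for chrom in genome:
        for a, b in zip(chrom, chrom[1:]):
            right_of_a = _ends(a)[1]
            left_of_b = _ends(b)[0]
            # Sort the two extremities so the adjacency is orientation-independent.
            counts[tuple(sorted((right_of_a, left_of_b)))] += 1
    return counts


def content_counts(genome):
    """Count the copies of each gene family, ignoring orientation."""
    return Counter(abs(g) for chrom in genome for g in chrom)


def encode(count_tables):
    """Assumed scheme: each feature gets as many bits as its largest
    multiplicity in any genome; multiplicity k is written as k ones
    followed by zeros."""
    features = sorted(set().union(*count_tables))
    widths = {f: max(t[f] for t in count_tables) for f in features}
    return [
        [bit for f in features for bit in [1] * t[f] + [0] * (widths[f] - t[f])]
        for t in count_tables
    ]


def encode_genomes(genomes):
    """Concatenate the adjacency bits and the gene content bits per genome."""
    adj = encode([adjacency_counts(g) for g in genomes])
    con = encode([content_counts(g) for g in genomes])
    return [a + c for a, c in zip(adj, con)]


if __name__ == "__main__":
    duplicated = [[1, 2, 1, 2]]  # one chromosome with a tandem duplication
    single = [[1, 2]]
    for row in encode_genomes([duplicated, single]):
        print("".join(map(str, row)))
```

Each genome becomes one row of a binary character matrix with the same number of columns, which is the kind of input a general-purpose ML phylogeny program for binary characters could accept. Which encoding variants and which ML software the paper actually uses are not stated in the abstract.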

【 License 】

CC BY   
© The Author(s) 2017
