Journal Article Details
BMC Genomics
Orthonome – a new pipeline for predicting high quality orthologue gene sets applicable to complete and draft genomes
Software
Ary A. Hoffmann1  Thu Nguyen1  Rahul V. Rane2  John G. Oakeshott3  Siu F. Lee4 
[1] Bio21 Institute, School of Biosciences, The University of Melbourne, Melbourne, Victoria, Australia; CSIRO, Canberra, Australian Capital Territory, Australia; Department of Biological Sciences, Macquarie University, Sydney, New South Wales, Australia
Keywords: Orthologue; Inparalogue; Gene duplication; Gene birth
DOI: 10.1186/s12864-017-4079-6
Received 2017-01-15, accepted 2017-08-21, published 2017
Source: Springer
【 Abstract 】

Background: Distinguishing orthologous and paralogous relationships between genes across multiple species is essential for comparative genomic analyses. Various computational approaches have been developed to resolve these evolutionary relationships, but strong trade-offs between precision and recall of orthologue prediction remain an ongoing challenge.

Results: Here we present Orthonome, an orthologue prediction pipeline designed to reduce the trade-off between orthologue capture rates (recall) and accuracy of multi-species orthologue prediction. The pipeline compares sequence domains and then forms sequence-similar clusters before using phylogenetic comparisons to identify inparalogues. It then corrects sequence similarity metrics for fragment and gene length bias using a novel scoring metric that captures relationships between full-length as well as fragmented genes. The remaining genes are then brought together for the identification of orthologues within a phylogenetic framework. The orthologue predictions are further calibrated, along with inparalogues and gene births, using synteny to identify novel orthologous relationships. We use 12 high-quality Drosophila genomes to show that, compared to other orthologue prediction pipelines, Orthonome provides orthogroups with minimal error but high recall. Furthermore, Orthonome is resilient to suboptimal assembly/annotation quality: including draft genomes from eight additional Drosophila species still yields >6500 1:1 orthologues across all twenty species while retaining a better combination of accuracy and recall than other pipelines. Orthonome is implemented as a searchable database and query tool, along with multiple-sequence alignment browsers for all sets of orthologues. The underlying documentation and database are accessible at http://www.orthonome.com.

Conclusion: We demonstrate that Orthonome provides a superior combination of orthologue capture rates and accuracy on complete and draft drosophilid genomes when tested alongside previously published pipelines. The study also highlights a greater degree of evolutionary conservation across drosophilid species than previously thought.
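
As an illustration of the precision and recall trade-off discussed in the abstract, the following minimal Python sketch scores a set of predicted orthologue pairs against a reference set. It is not part of Orthonome and does not reproduce the authors' benchmarking procedure; the function name `precision_recall` and the gene identifiers are hypothetical and chosen only for this example.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted orthologue pairs.

    Both arguments are iterables of 2-tuples of gene identifiers.
    """
    # Treat each pair as unordered so (a, b) and (b, a) count as the same call.
    pred = {frozenset(pair) for pair in predicted}
    ref = {frozenset(pair) for pair in reference}
    true_pos = len(pred & ref)
    # Precision: fraction of predicted pairs that are in the reference set.
    precision = true_pos / len(pred) if pred else 0.0
    # Recall: fraction of reference pairs that were predicted.
    recall = true_pos / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical gene identifiers, for illustration only.
reference = [
    ("dmel_g1", "dsim_g1"),
    ("dmel_g2", "dsim_g2"),
    ("dmel_g3", "dsim_g3"),
    ("dmel_g4", "dsim_g4"),
]
predicted = [
    ("dsim_g1", "dmel_g1"),  # correct, order reversed
    ("dmel_g2", "dsim_g2"),  # correct
    ("dmel_g3", "dsim_g9"),  # incorrect call
]

precision, recall = precision_recall(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the three predicted pairs are correct and two of the four reference pairs are recovered, which shows how a conservative predictor can score higher on precision than on recall.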

【 License 】

CC BY   
© The Author(s). 2017
