Journal article details
BMC Bioinformatics
morFeus: a web-based program to detect remotely conserved orthologs using symmetrical best hits and orthology network scoring
Bianca H Habermann [3], Vineeth Surendranath [4], Felix Oswald [2], Jose M Villaveces [3], Malvika Sharan [1], Michael Volkmer [3], Ines Wagner [4]
[1]IMIB, Julius-Maximilians-University Würzburg, Josef-Schneider-Strasse 2 - Building D15, Würzburg 97080, Germany
[2]Section Physics of Living Systems, Department of Physics and Astronomy & Laser Centre, VU University Amsterdam, De Boelelaan 1081, Office U0.30, Amsterdam 1081 HV, Netherlands
[3]Max Planck Institute of Biochemistry, Am Klopferspitz 18, Martinsried 82152, Germany
[4]Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstrasse 108, Dresden 01307, Germany
Keywords: Meta-analysis based orthology finder using symmetrical best hits; Eigenvector centrality; Orthology network; Reciprocal best hit; Alignment clustering; Orthology; Remote sequence conservation
DOI: 10.1186/1471-2105-15-263
Received 2014-05-19, accepted 2014-07-21, published 2014
【 Abstract 】

Background

Searching for the orthologs of a given protein or DNA sequence is one of the most important and most commonly used bioinformatics methods in biology. Programs like BLAST or the orthology search engine Inparanoid can find orthologs when the similarity between two sequences is sufficiently high. However, they fail when the level of conservation is low. Detecting remotely conserved proteins often involves sophisticated manual intervention that is difficult to automate.

Results

Here, we introduce morFeus, a search program to find remotely conserved orthologs. Starting from relaxed sequence similarity searches, morFeus selects sequences by the similarity of their alignments to the query, tests for orthology by iterative reciprocal BLAST searches, and calculates a network score for the resulting network of orthologs. This score is a measure of orthology that is independent of the E-value. Detecting remotely conserved orthologs of a protein using morFeus thus requires no manual intervention. We demonstrate the performance of morFeus by comparing it to state-of-the-art orthology resources and methods. We provide an example of remotely conserved orthologs that were experimentally shown to be functionally equivalent in the respective organisms and therefore meet the criteria of the orthology-function conjecture.
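
The two ideas named in this abstract and its keywords, symmetrical (reciprocal) best hits and an eigenvector-centrality score on the resulting orthology network, can be illustrated with a minimal Python sketch. This is not the morFeus implementation, and its actual scoring may differ. The `species|protein` ID convention, the function names, and the hit table below are hypothetical and exist only for illustration. Centrality is computed by plain power iteration on an undirected, unweighted, connected graph.

```python
from collections import defaultdict


def species_of(seq_id):
    """Hypothetical ID convention for this sketch: 'species|protein'."""
    return seq_id.split("|", 1)[0]


def symmetrical_best_hits(hits):
    """Return sorted pairs whose members are each other's best hit.

    hits: iterable of (query, subject, bit_score) rows, as one might parse
    from tabular similarity-search output. Best hits are taken per query
    and per subject species; same-species hits are ignored.
    """
    best = {}  # (query, subject species) -> (subject, score)
    for query, subject, score in hits:
        if species_of(query) == species_of(subject):
            continue
        key = (query, species_of(subject))
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    pairs = set()
    for (query, _), (subject, _) in best.items():
        back = best.get((subject, species_of(query)))
        if back is not None and back[0] == query:
            pairs.add(tuple(sorted((query, subject))))
    return sorted(pairs)


def eigenvector_centrality(edges, iterations=200, tol=1e-9):
    """Eigenvector centrality by power iteration (assumes a connected graph)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    nodes = sorted(neighbours)
    if not nodes:
        return {}
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Iterating with A + I keeps the eigenvectors of A but avoids
        # oscillation on bipartite graphs.
        new = {n: score[n] + sum(score[m] for m in neighbours[n]) for n in nodes}
        norm = sum(v * v for v in new.values()) ** 0.5
        new = {n: v / norm for n, v in new.items()}
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


# Made-up hit table: (query, subject, bit score).
hits = [
    ("human|Q", "yeast|A", 42.0),
    ("human|Q", "yeast|B", 30.0),
    ("yeast|A", "human|Q", 40.0),
    ("yeast|A", "human|P", 25.0),
    ("yeast|B", "human|P", 55.0),
    ("human|Q", "fly|C", 38.0),
    ("fly|C", "human|Q", 36.0),
    ("fly|C", "yeast|A", 33.0),
    ("yeast|A", "fly|C", 31.0),
    ("human|Q", "worm|D", 29.0),
    ("worm|D", "human|Q", 28.0),
]

pairs = symmetrical_best_hits(hits)
scores = eigenvector_centrality(pairs)
for node in sorted(scores, key=scores.get, reverse=True):
    print(f"{node}\t{scores[node]:.3f}")
```

In this toy network the query `human|Q` is linked to three symmetrical best hits and ranks highest, while `worm|D`, which is supported only by its link to the query, ranks lowest. `yeast|B` never enters the network because its own best human hit is a different protein. The ranking depends only on how the hits connect to each other, which is one way a network score can be independent of the E-value of any single alignment.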

Conclusions

Based on our results, we conclude that morFeus is a powerful and specific search method for detecting remotely conserved orthologs. morFeus is freely available at http://bio.biochem.mpg.de/morfeus/. Its source code is available from Sourceforge.net (https://sourceforge.net/p/morfeus/).

【 License 】

2014 Wagner et al.; licensee BioMed Central Ltd.
