Journal article details
BMC Bioinformatics
Gentrepid V2.0: a web server for candidate disease gene prediction
Merridee A Wouters [1], Diane Fatkin [2], Bruno Gaeta [6], Martin Oti [3], Arthur Liu [4], Naresh Bains [4], Richard A George [4], Jason Y Liu [4], Sara Ballouz [5]
[1]School of Medicine, Deakin University, Geelong, VIC 3217, Australia
[2]Molecular Cardiology and Biophysics Division, Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia
[3]Centre for Molecular and Biomolecular Informatics, Radboud University Nijmegen Medical Centre, Nijmegen, The Netherlands
[4]Structural and Computational Biology Department, Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia
[5]Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, 500 Sunnyside Boulevard, Woodbury, NY 11797, USA
[6]School of Computer Science and Engineering, University of New South Wales, Kensington, NSW 2052, Australia
Keywords: Hypertension; Genetic-association studies; Candidate gene identification; Phenotype; Genotype; Genome-wide association studies; Complex diseases; Mendelian diseases; Candidate disease genes; Candidate disease gene prediction
DOI: 10.1186/1471-2105-14-249
Received 2012-11-22; accepted 2013-08-13; published 2013
【 Abstract 】

Background

Candidate disease gene prediction is a rapidly developing area of bioinformatics research with the potential to deliver great benefits to human health. As experimental studies detecting associations between genetic intervals and disease proliferate, better bioinformatic techniques that can expand and exploit the data are required.

Description

Gentrepid is a web resource which predicts and prioritizes candidate disease genes for both Mendelian and complex diseases. The system can take input from linkage analysis of single genetic intervals or multiple marker loci from genome-wide association studies. The underlying database of the Gentrepid tool sources data from numerous gene and protein resources, taking advantage of the wealth of biological information available. Using known disease gene information from OMIM, the system predicts and prioritizes disease gene candidates that participate in the same protein pathways or share similar protein domains. Alternatively, using an ab initio approach, the system can detect enrichment of these protein annotations without prior knowledge of the phenotype.
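The seeded approach described above, ranking genes in an interval by the pathways or domains they share with known disease genes, can be illustrated with a minimal sketch. This is not Gentrepid's implementation: the gene names, pathway labels, scoring rule, and the `rank_candidates` function are all hypothetical and chosen only to show the idea.

```python
# Hypothetical toy annotations: gene -> set of pathway (or domain) labels.
# These are illustrative placeholders, not data from Gentrepid or OMIM.
GENE_ANNOTATIONS = {
    "SEED1": {"pathway_A", "pathway_B"},
    "CAND1": {"pathway_A"},
    "CAND2": {"pathway_C"},
    "CAND3": {"pathway_A", "pathway_B"},
}


def rank_candidates(seed_genes, interval_genes, annotations):
    """Rank interval genes by how many annotations they share with the seeds.

    Assumed scoring rule: the score is the count of shared annotation labels.
    Genes sharing nothing with the seeds are dropped.
    """
    # Pool every annotation carried by a known disease (seed) gene.
    seed_terms = set()
    for gene in seed_genes:
        seed_terms |= annotations.get(gene, set())

    scored = []
    for gene in interval_genes:
        shared = annotations.get(gene, set()) & seed_terms
        if shared:
            scored.append((gene, len(shared), sorted(shared)))

    # Highest score first; ties are broken alphabetically by gene name.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


ranking = rank_candidates(["SEED1"], ["CAND1", "CAND2", "CAND3"], GENE_ANNOTATIONS)
for gene, score, shared in ranking:
    print(gene, score, shared)
```

Here `CAND3` ranks above `CAND1` because it shares two annotations with the seed gene, while `CAND2` shares none and is excluded. A real system would weight annotations and assess statistical significance rather than count raw overlaps.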

Conclusions

The system aims to integrate the wealth of protein information currently available with known and novel phenotype/genotype information to acquire knowledge of the biological mechanisms underpinning disease. We have updated the system to facilitate analysis of GWAS data and the study of complex diseases. As an example, we apply the system to GWAS data on hypertension from the ICBP. One interesting prediction is a ZIP transporter in addition to the one found by the ICBP analysis. The web server URL is https://www.gentrepid.org/.

【 License 】

   
2013 Ballouz et al.; licensee BioMed Central Ltd.
