BMC Bioinformatics
Inferring high-confidence human protein-protein interactions
Xueping Yu1, Anders Wallqvist1, Jaques Reifman1
[1] Biotechnology High-Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Ft. Detrick, MD, 21702, USA
Keywords: Protein-protein interactions; Human protein interaction network; High confidence
DOI: 10.1186/1471-2105-13-79
Received 2011-10-05, accepted 2012-05-04, published 2012
【 Abstract 】
Background
As numerous experimental factors drive the acquisition, identification, and interpretation of protein-protein interactions (PPIs), aggregated assemblies of human PPI data invariably contain experiment-dependent noise. Ascertaining the reliability of PPIs collected from these diverse studies and scoring them to infer high-confidence networks is a non-trivial task. Moreover, a large number of PPIs share the same number of reported occurrences, making it impossible to distinguish the reliability of these PPIs and rank-order them. For example, for the data analyzed here, we found that the majority (>83%) of currently available human PPIs have been reported only once.
Results
In this work, we proposed an unsupervised statistical approach to score a set of diverse, experimentally identified PPIs from nine primary databases to create subsets of high-confidence human PPI networks. We evaluated this ranking method by comparing it with other methods and assessing their ability to retrieve protein associations from a number of diverse and independent reference sets. These reference sets contain known biological data that are either directly or indirectly linked to interactions between proteins. We quantified the average effect of using ranked protein interaction data to retrieve this information and showed that, when compared to randomly ranked interaction data sets, the proposed method created a larger enrichment (~134%) than either ranking based on the hypergeometric test (~109%) or occurrence ranking (~46%).
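To make the comparison against randomly ranked data concrete, here is a minimal Python sketch of occurrence ranking and of one plausible way to express enrichment over a random ranking. It is an illustration under stated assumptions, not the scoring method or the enrichment formula used in the paper: the function names (`rank_by_occurrence`, `enrichment_over_random`), the toy protein labels, and the definition of enrichment as the percent gain in reference-set hits among the top-ranked pairs are all hypothetical.

```python
from collections import Counter


def rank_by_occurrence(reports):
    """Rank unordered protein pairs by how often each pair was reported.

    `reports` is an iterable of (protein_a, protein_b) tuples; (A, B) and
    (B, A) are counted as the same interaction.
    """
    counts = Counter(frozenset(pair) for pair in reports)
    # Descending count; ties are broken alphabetically only to keep the
    # output deterministic. Tied pairs carry no real ordering information.
    return sorted(counts, key=lambda p: (-counts[p], sorted(p)))


def enrichment_over_random(ranked, reference, top_k):
    """Percent gain in reference hits among the top_k pairs, relative to
    the number expected if the same pairs were ranked at random.

    This definition is an assumption made for illustration.
    """
    hits = sum(1 for p in ranked[:top_k] if p in reference)
    total_hits = sum(1 for p in ranked if p in reference)
    expected = top_k * total_hits / len(ranked)
    if expected == 0:
        raise ValueError("no reference pairs among the ranked interactions")
    return 100.0 * (hits / expected - 1.0)


# Toy data: hypothetical protein labels, not taken from the paper.
reports = [
    ("A", "B"), ("B", "A"), ("A", "B"),
    ("C", "D"), ("C", "D"),
    ("E", "F"),
    ("G", "H"),
]
reference = {frozenset(("A", "B")), frozenset(("C", "D"))}

ranked = rank_by_occurrence(reports)
gain = enrichment_over_random(ranked, reference, top_k=2)
print(f"enrichment over random: {gain:.0f}%")
```

In this toy example the pairs E-F and G-H are each reported once, so occurrence counts alone cannot order them. That is the limitation the Background section describes for the majority of human PPIs.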
Conclusions
From our evaluations, it was clear that ranked interactions were always of value because higher-ranked PPIs had a higher likelihood of retrieving high-confidence experimental data. Reducing the noise inherent in aggregated experimental PPIs via our ranking scheme further increased the accuracy and enrichment of PPIs derived from a number of biologically relevant data sets. These results suggest that using our high-confidence protein interactions at different levels of confidence will help clarify the topological and biological properties associated with human protein networks.
【 License 】
2012 Yu et al.; licensee BioMed Central Ltd.