Journal article details
BMC Bioinformatics
Transferring functional annotations of membrane transporters on the basis of sequence similarity and sequence motifs
Ahmad Barghash [1], Volkhard Helms [1]
[1] Center for Bioinformatics, Saarland University, Postfach 15 11 50, 66041 Saarbrücken, Germany
Keywords: Sequence homology; TC classification system; Substrate; MEME; HMMER; BLAST; Functional classification; Membrane transporter
Others  :  1087695
DOI  :  10.1186/1471-2105-14-343
Received 2013-07-29, accepted 2013-11-19, published 2013
【 Abstract 】

Background

Membrane transporters catalyze the transport of small solute molecules across biological barriers such as lipid bilayer membranes. Experimental identification of the transported substrates is very tedious. Once a particular transport mechanism has been identified in one organism, it is thus highly desirable to transfer this information to related transporter sequences in different organisms based on bioinformatics evidence.

Results

We present a thorough benchmark of the level of sequence identity at which membrane transporters from Escherichia coli, Saccharomyces cerevisiae, and Arabidopsis thaliana belong to the same families of the Transporter Classification (TC) system, and of the level at which these transporters mediate the transport of the same substrate. We found that two membrane transporter sequences from different organisms that align with a normalized BLAST expectation value (E-value) better than 1e-8 are highly likely to belong to the same TC family (F-measure around 90%). Enriched sequence motifs identified by MEME at thresholds below 1e-12 support accurate classification into TC families for about two thirds of the sequences (F-measure 80% and higher). For the comparison of transported substrates, we focused on the four largest substrate classes: amino acids, sugars, metal ions, and phosphate. At similar identity thresholds, the nature of the transported substrates was more divergent (F-measure 40-75%) than TC family membership.
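The F-measure figures quoted above can be illustrated with a minimal sketch. It uses the standard definition of the F-measure as the harmonic mean of precision and recall. The input format (a list of cross-organism sequence pairs, each with an E-value and a flag saying whether both sequences share a TC family) and the function name `f_measure` are assumptions made for illustration. The abstract does not describe the authors' exact evaluation protocol.

```python
from typing import Iterable, Tuple


def f_measure(pairs: Iterable[Tuple[float, bool]], threshold: float = 1e-8) -> float:
    """F-measure for predicting 'same TC family' from an E-value cutoff.

    Each item in `pairs` is (evalue, same_family). A pair is predicted to
    share a family when its E-value is strictly below `threshold`.
    The pair-based input format is an assumption, not the paper's protocol.
    """
    tp = fp = fn = 0
    for evalue, same_family in pairs:
        predicted = evalue < threshold
        if predicted and same_family:
            tp += 1          # correctly predicted same family
        elif predicted and not same_family:
            fp += 1          # predicted same family, but families differ
        elif not predicted and same_family:
            fn += 1          # missed a true same-family pair
    if tp == 0:
        return 0.0           # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up toy data, only to show the call.
toy_pairs = [(1e-20, True), (1e-10, True), (1e-9, False), (1e-3, True), (1.0, False)]
score = f_measure(toy_pairs)
```

Sweeping `threshold` over a range of values and recording the resulting score is one way to compare cutoffs such as 1e-8 and 1e-12 on a labeled data set.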

Conclusions

We suggest a threshold of 1e-8 for BLAST and HMMER, at which at least three quarters of the sequences are classified according to the TC system with reasonably high accuracy. Because these thresholds are normalized, researchers who wish to apply them in their own studies should multiply them by the size of the database they search against. Our findings should be useful to those who wish to transfer transporter functional annotations across species.
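The rescaling step described above can be sketched as follows. The function names and the example database size of 5000 are hypothetical. The abstract does not say how database size is measured (for example, number of sequences or total residues), so the sketch treats it as an opaque positive number supplied by the user.

```python
def scale_threshold(normalized_threshold: float, database_size: float) -> float:
    """Convert a normalized E-value threshold into one for a specific database.

    Follows the abstract's instruction to multiply the threshold by the size
    of the searched database. The unit of `database_size` is an assumption
    left to the caller.
    """
    if database_size <= 0:
        raise ValueError("database_size must be positive")
    return normalized_threshold * database_size


def passes_threshold(evalue: float, database_size: float,
                     normalized_threshold: float = 1e-8) -> bool:
    """True if a reported E-value is better (lower) than the rescaled cutoff."""
    return evalue < scale_threshold(normalized_threshold, database_size)


# Hypothetical example: a search against a database of size 5000.
cutoff = scale_threshold(1e-8, 5000)       # roughly 5e-5
hit_ok = passes_threshold(1e-6, 5000)      # 1e-6 is below the rescaled cutoff
```

Keeping the rescaling in one helper makes it easy to reuse the same normalized threshold across searches against databases of different sizes.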

【 License 】

   
2013 Barghash and Helms; licensee BioMed Central Ltd.
