Journal article details
BMC Bioinformatics
ConSole: using modularity of Contact maps to locate Solenoid domains in protein structures
Thomas Hrabe [1], Adam Godzik [1]
[1] Program in Bioinformatics and Systems Biology, Sanford-Burnham Medical Research Institute, 92037 La Jolla, CA, USA
Keywords: Machine learning; Template matching; Contact map; Solenoid structure; Protein repeat detection
DOI  :  10.1186/1471-2105-15-119
Received 2014-02-03; accepted 2014-04-17; published 2014
【 Abstract 】

Background

Periodic proteins, characterized by multiple repeats of short motifs, form an interesting and seldom-studied group. Because their sequences often diverge extremely, such motifs are detected and analyzed more reliably at the structural level. Yet few algorithms have been developed to detect and analyze the structures of periodic proteins.

Results

ConSole recognizes modularity in protein contact maps, allowing precise identification of repeats in solenoid protein structures, an important subgroup of periodic proteins. Benchmark tests show that ConSole achieves higher recognition accuracy than Raphael, the only other publicly available solenoid structure detection tool. As a next step of ConSole analysis, we show how detecting solenoid repeats in structures can improve sequence recognition of these motifs and reveal subtle irregularities in repeat lengths in three solenoid protein families.
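
To illustrate why a contact map exposes solenoid repeats, the following minimal sketch builds a binary contact map for a toy coil and looks for the sequence offset at which contacts recur. It is not the ConSole algorithm, which the abstract describes only at a high level. The function names, the idealized coil geometry (radius, rise per turn), the 5.5 Å cutoff, and the offset-counting heuristic are all assumptions made for illustration.

```python
import math

def solenoid_trace(n_residues, residues_per_turn, radius=14.5, rise_per_turn=4.8):
    """Toy C-alpha trace: points on an ideal coil (hypothetical geometry, in angstroms)."""
    coords = []
    for i in range(n_residues):
        angle = 2.0 * math.pi * i / residues_per_turn
        coords.append((radius * math.cos(angle),
                       radius * math.sin(angle),
                       rise_per_turn * i / residues_per_turn))
    return coords

def contact_map(coords, cutoff=5.5):
    """Symmetric binary matrix: 1 where two residues lie within `cutoff` of each other."""
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

def dominant_offset(cmap, min_offset=6):
    """Return the sequence separation k >= min_offset with the highest contact fraction.

    A solenoid stacks each coil on the previous one, so contacts concentrate
    on the off-diagonal band at k = repeat length (assumed heuristic).
    """
    n = len(cmap)
    best_offset, best_fraction = None, 0.0
    for k in range(min_offset, n // 2):
        fraction = sum(cmap[i][i + k] for i in range(n - k)) / (n - k)
        if fraction > best_fraction:
            best_offset, best_fraction = k, fraction
    return best_offset, best_fraction

if __name__ == "__main__":
    coords = solenoid_trace(n_residues=96, residues_per_turn=24)
    offset, fraction = dominant_offset(contact_map(coords))
    print(offset, fraction)  # the dominant offset equals the 24-residue repeat of the toy coil
```

On real structures the band is broken by insertions and irregular repeat lengths, which is why a plain offset count like this one would only be a starting point.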

Conclusions

The ConSole algorithm provides a fast and accurate tool to recognize solenoid protein structures as a whole and to identify individual solenoid repeat units within a structure. ConSole is available as an interactive web server and for download at http://console.sanfordburnham.org.

【 License 】

   
2014 Hrabe and Godzik; licensee BioMed Central Ltd.
