Journal Article Details
PLoS Pathogens
Genome-Scale Identification of Legionella pneumophila Effectors Using a Machine Learning Approach
David Burstein [1]; Tal Pupko [1]; Gil Segal [2]; Tal Zusman [2]; Ram Viner [2]; Elena Degtyar [2]
[1] Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel; [2] Department of Molecular Microbiology and Biotechnology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
Keywords: Legionella pneumophila; Machine learning algorithms; Secretion systems; Pathogenesis; Gene prediction; Genomic signal processing; Machine learning; Bacterial pathogens
DOI: 10.1371/journal.ppat.1000508
Subject classification: Biological Sciences (General)
Source: Public Library of Science
【 Abstract 】

A large number of highly pathogenic bacteria utilize secretion systems to translocate effector proteins into host cells. Using these effectors, the bacteria subvert host cell processes during infection. Legionella pneumophila translocates effectors via the Icm/Dot type-IV secretion system and to date, approximately 100 effectors have been identified by various experimental and computational techniques. Effector identification is a critical first step towards the understanding of the pathogenesis system in L. pneumophila as well as in other bacterial pathogens. Here, we formulate the task of effector identification as a classification problem: each L. pneumophila open reading frame (ORF) was classified as either effector or not. We computationally defined a set of features that best distinguish effectors from non-effectors. These features cover a wide range of characteristics including taxonomical dispersion, regulatory data, genomic organization, similarity to eukaryotic proteomes and more. Machine learning algorithms utilizing these features were then applied to classify all the ORFs within the L. pneumophila genome. Using this approach we were able to predict and experimentally validate 40 new effectors, reaching a success rate of above 90%. Increasing the number of validated effectors to around 140, we were able to gain novel insights into their characteristics. Effectors were found to have low G+C content, supporting the hypothesis that a large number of effectors originate via horizontal gene transfer, probably from their protozoan host. In addition, effectors were found to cluster in specific genomic regions. Finally, we were able to provide a novel description of the C-terminal translocation signal required for effector translocation by the Icm/Dot secretion system. To conclude, we have discovered 40 novel L. pneumophila effectors, predicted over a hundred additional highly probable effectors, and shown the applicability of machine learning algorithms for the identification and characterization of bacterial pathogenesis determinants.

【 License 】

CC BY   
