Journal article details
Molecular Systems Biology
Prediction and identification of sequences coding for orphan enzymes using genomic and metagenomic neighbours
Takuji Yamada [2], Alison S Waller [2], Jeroen Raes [1], Aleksej Zelezniak [2], Nadia Perchat [3], Alain Perret [3], Marcel Salanoubat [3], Kiran R Patil [2], Jean Weissenbach [3]
[1] Molecular and Cellular Interactions Department, VIB, Brussels, Belgium; [2] Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany; [3] Commissariat à l'Energie Atomique, Evry, France
Keywords: genomics; metabolic pathways; metagenomics; neighbourhood information; orphan enzymes
DOI: 10.1038/msb.2012.13
Source: Wiley
【 Abstract 】

Despite the current wealth of sequencing data, one-third of all biochemically characterized metabolic enzymes lack a corresponding gene or protein sequence, and as such can be considered orphan enzymes. They represent a major gap between our molecular and biochemical knowledge, and consequently are not amenable to modern systemic analyses. As 555 of these orphan enzymes have metabolic pathway neighbours, we developed a global framework that utilizes the pathway and (meta)genomic neighbour information to assign candidate sequences to orphan enzymes. For 131 orphan enzymes (37% of those for which (meta)genomic neighbours are available), we associate sequences to them using scoring parameters with an estimated accuracy of 70%, implying functional annotation of 16 345 gene sequences in numerous (meta)genomes. As a case in point, two of these candidate sequences were experimentally validated to encode the predicted activity. In addition, we augmented the currently available genome-scale metabolic models with these new sequence–function associations and were able to expand the models by on average 8%, with a considerable change in the flux connectivity patterns and improved essentiality prediction.

Synopsis

Many characterized metabolic enzymes currently lack associated gene and protein sequences. Here, pathway and genomic neighbour data are used to assign genes to these ‘orphan enzymes,’ and the predictions are validated with experimental assays and genome-scale metabolic modelling.

  • A computational method is developed for assigning candidate sequences to orphan enzymes. The method uses metabolic pathway, genomic neighbourhood, genomic co-occurrence, and protein domain information to predict genes that are likely to perform a particular enzymatic function.
  • Benchmarking of the scoring scheme based on the four features above revealed that some combinations of parameters yielded greater than 70% accuracy, and that high-confidence predictions could be generated for 131 orphan enzymes.
  • Enzyme assay experiments confirmed the predicted enzymatic activity for two of the high-confidence candidate sequences.
  • Predicted functions can improve the annotation of genomic and metagenomic data, and can reveal putative genes for enzymes with potential biotechnological applications.
  • Incorporating the predicted enzymatic reactions into genome-scale metabolic models changed the flux connectivity and improved their ability to correctly predict gene essentiality, supporting the biological relevance of these predictions.
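
The scoring idea summarized in the points above can be illustrated with a minimal sketch. This is not the authors' implementation: the equal weights, the 0-to-1 scaling of each feature, the threshold, and all names (`Evidence`, `WEIGHTS`, `combined_score`, `rank_candidates`) are assumptions made purely for illustration. It only shows how four kinds of evidence could be combined into one score and filtered to high-confidence candidates.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Evidence for one candidate gene; each value is assumed scaled to [0, 1]."""
    pathway_neighbour: float      # support from metabolic pathway neighbours
    genomic_neighbourhood: float  # support from (meta)genomic neighbourhood
    co_occurrence: float          # support from genomic co-occurrence
    domain_match: float           # support from protein domain information


# Hypothetical equal weights; the paper benchmarks parameter combinations,
# and these values are NOT taken from it.
WEIGHTS = {
    "pathway_neighbour": 0.25,
    "genomic_neighbourhood": 0.25,
    "co_occurrence": 0.25,
    "domain_match": 0.25,
}


def combined_score(ev: Evidence) -> float:
    """Weighted sum of the four evidence values."""
    return sum(weight * getattr(ev, name) for name, weight in WEIGHTS.items())


def rank_candidates(
    candidates: dict[str, Evidence], threshold: float
) -> list[tuple[str, float]]:
    """Keep candidates scoring at or above the threshold, best first."""
    scored = [(gene, combined_score(ev)) for gene, ev in candidates.items()]
    kept = [(gene, score) for gene, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up candidates for a single orphan enzyme.
    demo = {
        "geneA": Evidence(1.0, 1.0, 0.5, 0.5),
        "geneB": Evidence(0.0, 0.5, 0.0, 0.5),
    }
    print(rank_candidates(demo, threshold=0.5))
```

In practice the weights and threshold would be chosen by benchmarking against enzymes whose sequences are already known, which is how the accuracy figures quoted above would be estimated.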

【 License 】

CC BY-NC-SA   
Copyright © 2012 EMBO and Macmillan Publishers Limited

Distributed under the Creative Commons Attribution License, which permits distribution and reproduction in any medium, provided the original author and source are credited. This license does not permit commercial exploitation without specific permission.
