BMC Systems Biology
Precise generation of systems biology models from KEGG pathways
Andreas Zell [1], Andreas Dräger [1], Manuel Ruff [1], Finja Büchel [1], Clemens Wrzodek [1]
[1] Center for Bioinformatics Tuebingen (ZBIT), University of Tuebingen, Sand 1, 72076 Tübingen, Germany
Keywords: Comparison; Converter; Quantitative modeling; Qualitative modeling; Systems biology; Modeling; BioPAX; SBML; KGML; KEGG
DOI: 10.1186/1752-0509-7-15
Received 2012-06-27, accepted 2013-01-25, published 2013
【 Abstract 】

Background

The KEGG PATHWAY database provides a plethora of pathways for a diversity of organisms. All pathway components are directly linked to other KEGG databases, such as KEGG COMPOUND or KEGG REACTION. Therefore, the pathways can be extended with an enormous amount of information and provide a foundation for initial structural modeling approaches. As a drawback, KGML-formatted KEGG pathways are primarily designed for visualization purposes and often omit important details for the sake of a clear arrangement of their entries. Thus, a direct conversion into systems biology models would produce incomplete and erroneous models.
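To make the drawback concrete, the following minimal sketch reads the reaction elements of a KGML document using Python's standard library. The KGML fragment is hand-written for illustration and is not a KEGG download; the function name `list_reactions` is hypothetical and this is not the authors' implementation. It shows that a reaction in KGML carries only the names of its listed substrates and products, with no stoichiometric coefficients, so a converter has to obtain the full equation from elsewhere (for example, KEGG REACTION).

```python
import xml.etree.ElementTree as ET

# Hand-written, KGML-like fragment for illustration only (an assumption,
# not real KEGG output). Element and attribute names follow the KGML layout.
KGML = """<pathway name="path:hsa00010" org="hsa" number="00010" title="Example">
  <entry id="1" name="hsa:3098" type="gene" reaction="rn:R00299"/>
  <entry id="2" name="cpd:C00031" type="compound"/>
  <entry id="3" name="cpd:C00092" type="compound"/>
  <reaction id="1" name="rn:R00299" type="irreversible">
    <substrate id="2" name="cpd:C00031"/>
    <product id="3" name="cpd:C00092"/>
  </reaction>
</pathway>"""


def list_reactions(kgml_text):
    """Return one dict per <reaction> element found in the KGML text."""
    root = ET.fromstring(kgml_text)
    reactions = []
    for reaction in root.findall("reaction"):
        reactions.append({
            "name": reaction.get("name"),
            "reversible": reaction.get("type") == "reversible",
            # Only names are available here; coefficients are not part of
            # the substrate/product elements.
            "substrates": [s.get("name") for s in reaction.findall("substrate")],
            "products": [p.get("name") for p in reaction.findall("product")],
        })
    return reactions


reactions = list_reactions(KGML)
for r in reactions:
    print(r["name"], r["substrates"], "->", r["products"])
```

A direct translation of this output into a model reaction would contain exactly one substrate and one product, which is the kind of incomplete result the paragraph above warns about.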

Results

Here, we present a precise method for processing and converting KEGG pathways into initial metabolic and signaling models encoded in the standardized community pathway formats SBML (Levels 2 and 3) and BioPAX (Levels 2 and 3). This method involves correcting invalid or incomplete KGML content, creating complete and valid stoichiometric reactions, translating relations to signaling models, and augmenting the pathway content with additional information, such as cross-references to Entrez Gene, OMIM, UniProt, ChEBI, and many more.
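One part of the augmentation step, attaching machine-readable cross-references, can be sketched as follows. This is an illustrative assumption rather than the method's actual code: the helper `to_identifiers_uri` and the small prefix table are hypothetical, and only KEGG's own identifiers.org namespaces are covered. Resolving further references (Entrez Gene, OMIM, UniProt, ChEBI) would require lookups in the KEGG databases, which the sketch does not attempt.

```python
# Assumed subset of KGML name prefixes and their identifiers.org namespaces.
NAMESPACES = {
    "cpd": "kegg.compound",
    "rn": "kegg.reaction",
    "path": "kegg.pathway",
}


def to_identifiers_uri(kgml_name, gene_orgs=("hsa",)):
    """Map a KGML name such as 'cpd:C00031' to an identifiers.org URI.

    gene_orgs lists organism codes whose entries are treated as KEGG genes
    (hypothetical parameter; only 'hsa' by default).
    Returns None when the prefix is not recognized.
    """
    prefix, _, accession = kgml_name.partition(":")
    if prefix in NAMESPACES:
        return f"https://identifiers.org/{NAMESPACES[prefix]}/{accession}"
    if prefix in gene_orgs:
        # KEGG gene identifiers keep their organism prefix.
        return f"https://identifiers.org/kegg.genes/{kgml_name}"
    return None


for name in ("cpd:C00031", "rn:R00299", "hsa:3098", "path:hsa00010"):
    print(name, "->", to_identifiers_uri(name))
```

Such URIs could then be stored as annotations on the corresponding model elements, so that downstream tools can resolve each species or reaction back to its database record.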

Finally, we compare several existing conversion tools for KEGG pathways and show that the conversion from KEGG to BioPAX does not involve a loss of information, whilst lossless translations to SBML can only be performed using SBML Level 3, including its recently proposed qualitative models and groups extension packages.

Conclusions

Building correct BioPAX and SBML signaling models from the KEGG database is a unique characteristic of the proposed method. Further, there is no other approach that is able to appropriately construct metabolic models from KEGG pathways, including correct reactions with stoichiometry. The resulting initial models, which contain valid and comprehensive SBML or BioPAX code and a multitude of cross-references, lay the foundation to facilitate further modeling steps.

【 License 】

   
2013 Wrzodek et al.; licensee BioMed Central Ltd.
