BMC Bioinformatics
COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project
Frank T Bergmann [6], Richard Adams [5], Stuart Moodie [4], Jonathan Cooper [11], Mihai Glont [10], Martin Golebiewski [8], Michael Hucka [1], Camille Laibe [10], Andrew K Miller [2], David P Nickerson [2], Brett G Olivier [7], Nicolas Rodriguez [9], Herbert M Sauro [3], Martin Scharm [13], Stian Soiland-Reyes [12], Dagmar Waltemath [13], Florent Yvon [10], Nicolas Le Novère [10]
[1] Computing and Mathematical Sciences, California Institute of Technology, Pasadena 91125, CA, USA
[2] Auckland Bioengineering Institute, University of Auckland, Private Bag 92019, Auckland Mail Centre, Auckland 1142, New Zealand
[3] Department of Bioengineering, University of Washington, Seattle 98195, WA, USA
[4] Current affiliation: Eight Pillars Ltd, 19 Redford Walk, Edinburgh EH13 0AG, UK
[5] ResearchSpace, 24 Fountainhall Road, Edinburgh EH9 2LW, UK
[6] Modelling of Biological Processes, BioQUANT/COS, University of Heidelberg, INF 267, Heidelberg 69120, Germany
[7] Systems Bioinformatics, VU University Amsterdam, Amsterdam 1081 HV, The Netherlands
[8] HITS gGmbH, Schloss-Wolfsbrunnenweg 35, Heidelberg, D-69118, Germany
[9] Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
[10] European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
[11] Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford OX1 3QD, UK
[12] School of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
[13] Systems Biology and Bioinformatics, University of Rostock, Ulmenstrasse 69, Rostock 18057, Germany
Keywords: Reproducible science; Reproducible research; Computational modeling; Archive; Data format
DOI: 10.1186/s12859-014-0369-z
Received 2014-07-04; accepted 2014-10-30; published 2014
【 Abstract 】
Background
With the ever-increasing use of computational models in the biosciences, the need to share models and reproduce the results of published studies efficiently and easily is becoming more important. To this end, various standards have been proposed that can be used to describe models, simulations, data or other essential information in a consistent fashion. Each of these describes a separate component required to reproduce a given published scientific result.
Results
We describe the Open Modeling EXchange format (OMEX). Together with the use of other standard formats from the Computational Modeling in Biology Network (COMBINE), OMEX is the basis of the COMBINE Archive, a single file that supports the exchange of all the information necessary for a modeling and simulation experiment in biology. An OMEX file is a ZIP container that includes a manifest file, listing the content of the archive, an optional metadata file adding information about the archive and its content, and the files describing the model. The content of a COMBINE Archive consists of files encoded in COMBINE standards whenever possible, but may include additional files defined by an Internet Media Type. Several tools that support the COMBINE Archive are available, either as independent libraries or embedded in modeling software.
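The container layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the manifest file name `manifest.xml`, the element names `omexManifest` and `content`, the attributes `location` and `format`, and the `identifiers.org` format URIs are assumptions taken from the published OMEX specification rather than from this abstract, and the helper names `build_manifest`, `write_archive` and `list_contents` are hypothetical. The optional metadata file is omitted.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed identifiers (from the OMEX specification, not from this abstract).
MANIFEST_NS = "http://identifiers.org/combine.specifications/omex-manifest"
OMEX_FORMAT = "http://identifiers.org/combine.specifications/omex"
SBML_FORMAT = "http://identifiers.org/combine.specifications/sbml"


def build_manifest(entries):
    """Serialize (location, format) pairs into manifest XML bytes."""
    ET.register_namespace("", MANIFEST_NS)
    root = ET.Element(f"{{{MANIFEST_NS}}}omexManifest")
    for location, fmt in entries:
        ET.SubElement(
            root,
            f"{{{MANIFEST_NS}}}content",
            {"location": location, "format": fmt},
        )
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def write_archive(target, files):
    """Write a ZIP container holding a manifest plus the given files.

    `files` maps an archive-relative name to a (bytes, format URI) pair.
    """
    # The manifest lists the archive itself, the manifest, and every file.
    entries = [(".", OMEX_FORMAT), ("./manifest.xml", MANIFEST_NS)]
    entries += [(f"./{name}", fmt) for name, (_, fmt) in files.items()]
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.xml", build_manifest(entries))
        for name, (data, _) in files.items():
            zf.writestr(name, data)


def list_contents(source):
    """Read the manifest back and return its (location, format) pairs."""
    with zipfile.ZipFile(source) as zf:
        root = ET.fromstring(zf.read("manifest.xml"))
    tag = f"{{{MANIFEST_NS}}}content"
    return [(c.get("location"), c.get("format")) for c in root.iter(tag)]


if __name__ == "__main__":
    buffer = io.BytesIO()
    # Placeholder model content; a real archive would hold a full SBML file.
    write_archive(buffer, {"model.xml": (b"<sbml/>", SBML_FORMAT)})
    buffer.seek(0)
    for location, fmt in list_contents(buffer):
        print(location, fmt)
```

Because the container is an ordinary ZIP file, any standard ZIP tool can open it; the manifest is what tells software which format each entry is encoded in.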
Conclusions
The COMBINE Archive facilitates the reproduction of modeling and simulation experiments in biology by embedding all the relevant information in one file. Having all the information stored and exchanged at once also helps in building activity logs and audit trails. We anticipate that the COMBINE Archive will become a significant help for modellers, as the domain moves to larger, more complex experiments such as multi-scale models of organs, digital organisms, and bioengineering.
【 License 】
2014 Bergmann et al.; licensee BioMed Central Ltd.