BMC Bioinformatics
KNIME-CDK: Workflow-driven cheminformatics
Christoph Steinbeck [3], Michael Berthold [1], Luis F de Figueiredo [3], Bernd Wiswedel [2], Thorsten Meinl [1], Stephan Beisken [3]
[1]Nycomed Chair for Bioinformatics and Information Mining, University of Konstanz, Konstanz, Germany
[2]KNIME.com AG, Technoparkstr. 1, 8005 Zürich, Switzerland
[3]European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
Keywords: Software library; Data integration; Workflows; Cheminformatics
DOI: 10.1186/1471-2105-14-257
Received 2013-01-15, accepted 2013-08-21, published 2013
【 Abstract 】

Background

Cheminformaticians have to routinely process and analyse libraries of small molecules. Among other things, that includes the standardization of molecules, calculation of various descriptors, visualisation of molecular structures, and downstream analysis. For this purpose, scientific workflow platforms such as the Konstanz Information Miner can be used if provided with the right plug-in. A workflow-based cheminformatics tool provides the advantage of ease-of-use and interoperability between complementary cheminformatics packages within the same framework, hence facilitating the analysis process.

Results

KNIME-CDK comprises functions for converting molecules to and from common formats and for generating signatures, fingerprints, and molecular properties. It is based on the Chemistry Development Kit (CDK) and uses the Chemical Markup Language (CML) for persistence. A comparison with the cheminformatics plug-in RDKit shows that KNIME-CDK supports a similar range of chemical classes and adds new functionality to the framework. We describe the design and integration of the plug-in, and demonstrate the usage of the nodes on ChEBI, a library of small molecules of biological interest.

Conclusions

KNIME-CDK is an open-source plug-in for the Konstanz Information Miner, a free workflow platform. KNIME-CDK is built on top of the open-source Chemistry Development Kit (CDK) and allows for efficient cross-vendor structural cheminformatics. Its ease of use and modularity enable researchers to automate routine tasks and data analysis, bringing complementary cheminformatics functionality to the workflow environment.

【 License 】

   
2013 Beisken et al.; licensee BioMed Central Ltd.

【 Figures 】

Figure 1.
