BMC Bioinformatics
Optimising orbit counting of arbitrary order by equation selection
0000 0001 2069 7798, grid.5342.0, Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium; 0000 0001 0478 8040, grid.11486.3a, Department of Plant Systems Biology, VIB, Technologiepark 927, 9052, Ghent, Belgium; 0000 0001 2069 7798, grid.5342.0, Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052, Ghent, Belgium; 0000 0001 2107 2298, grid.49697.35, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria 0028, South Africa; 0000 0001 2069 7798, grid.5342.0, Ghent University - imec, IDLab, Technologiepark 15, 9052, Ghent, Belgium
Keywords: Graph theory; Graphlets; Orbits; Equations; Optimisation; Cytoscape app
DOI: 10.1186/s12859-018-2483-9
Source: publisher
【 Abstract 】
Background: Graphlets are useful for bioinformatics network analysis. Based on the structure of Hočevar and Demšar’s ORCA algorithm, we have created an orbit counting algorithm named Jesse. Like ORCA, this algorithm uses equations to count the orbits, but unlike ORCA it can count graphlets of any order. To do so, it generates the required internal structures and equations automatically. However, many more redundant equations are generated, and Jesse’s running time depends strongly on which of these equations are used. This paper therefore investigates which equations are most efficient and which factors affect this efficiency.
Results: With appropriate equation selection, Jesse’s running time can be reduced by up to a factor of 2 in the best case, compared to using randomly selected equations. Which equations are most efficient depends on the density of the graph, but barely on the graph type. At low graph density, equations whose right-hand side terms have few arguments are more efficient, whereas at high density, equations whose right-hand side terms have many arguments are most efficient. At a density between 0.6 and 0.7, both types of equations are about equally efficient.
Conclusions: By automatically selecting the best equations based on graph density, our Jesse algorithm became up to a factor of 2 more efficient. It was adapted into a Cytoscape App that is freely available from the Cytoscape App Store to ease application by bioinformaticians.
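The density-based selection described in the abstract can be illustrated with a minimal sketch. This is not Jesse's implementation. The function names, the labels `"few_arguments"` and `"many_arguments"`, and the crossover value of 0.65 (the midpoint of the reported 0.6 to 0.7 range) are all assumptions made for illustration, and density is taken to be the usual edge density of a simple undirected graph.

```python
def graph_density(num_nodes: int, num_edges: int) -> float:
    """Edge density of a simple undirected graph: edges present / edges possible."""
    if num_nodes < 2:
        return 0.0  # no node pairs, so no possible edges
    return 2.0 * num_edges / (num_nodes * (num_nodes - 1))


def preferred_equation_type(density: float, crossover: float = 0.65) -> str:
    """Pick which kind of equation to favour for a given density.

    The crossover default is a hypothetical midpoint of the 0.6-0.7 range
    in which both kinds are reported to perform about equally.
    """
    # Low density: favour equations whose right-hand side terms have few arguments.
    # High density: favour equations whose right-hand side terms have many arguments.
    return "few_arguments" if density < crossover else "many_arguments"


if __name__ == "__main__":
    d = graph_density(num_nodes=5, num_edges=2)
    print(d, preferred_equation_type(d))
```

Near the crossover the choice matters little, since both kinds of equations are reported to be about equally efficient there.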
【 License 】
CC BY