Journal article details
BMC Systems Biology
ENNET: inferring large gene regulatory networks from expression data using gradient boosting
Tomasz Arodź [1], Janusz Sławek [1]
[1] Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia
Keywords: Boosting; Ensemble learning; Network inference; Gene regulatory networks
DOI  :  10.1186/1752-0509-7-106
Received 2013-06-24, accepted 2013-10-17, published 2013
【 Abstract 】

Background

The regulation of gene expression by transcription factors is a key determinant of cellular phenotypes. Deciphering genome-wide networks that capture which transcription factors regulate which genes is one of the major efforts towards understanding and accurate modeling of living systems. However, reverse-engineering the network from gene expression profiles remains a challenge, because the data are noisy, high dimensional and sparse, and the regulation is often obscured by indirect connections.

Results

We introduce ENNET, a gene regulatory network inference algorithm that reverse-engineers networks of transcriptional regulation from a variety of expression profiles with higher accuracy than state-of-the-art methods. The method relies on boosting of regression stumps, combined with a relative variable importance measure, for the initial scoring of transcription factors with respect to each gene. We then propose a technique that uses the distribution of the initial scores and information about knockouts to refine the predictions. We evaluated the method on the DREAM3, DREAM4 and DREAM5 data sets and achieved higher accuracy than the winners of those competitions and other established methods.
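The initial scoring step can be illustrated with a minimal sketch. It assumes squared-error loss, exhaustive split search, and variable importance defined as the summed squared-error reduction per transcription factor, normalized to sum to one. The function names (`best_stump`, `score_regulators`) and the parameters `n_rounds` and `learning_rate` are hypothetical and chosen for illustration; this is not the ENNET implementation, and it omits the score refinement step.

```python
import numpy as np

def best_stump(X, r):
    """Return (feature, threshold, left_value, right_value, gain) for the
    least-squares regression stump that best fits the residuals r."""
    n, p = X.shape
    total = r.sum()
    best = (0, 0.0, r.mean(), r.mean(), 0.0)
    left_n = np.arange(1, n)
    right_n = n - left_n
    for j in range(p):
        order = np.argsort(X[:, j])
        xs, rs = X[order, j], r[order]
        left_sum = np.cumsum(rs)[:-1]
        right_sum = total - left_sum
        # Reduction in squared error relative to predicting the overall mean.
        gain = left_sum**2 / left_n + right_sum**2 / right_n - total**2 / n
        gain[xs[1:] == xs[:-1]] = -np.inf  # no split between equal values
        k = int(np.argmax(gain))
        if gain[k] > best[4]:
            threshold = 0.5 * (xs[k] + xs[k + 1])
            best = (j, threshold, left_sum[k] / left_n[k],
                    right_sum[k] / right_n[k], gain[k])
    return best

def score_regulators(X, y, n_rounds=200, learning_rate=0.1):
    """Boost regression stumps to predict target-gene expression y from
    transcription-factor expression X (samples x factors); return the
    relative importance of each factor."""
    importance = np.zeros(X.shape[1])
    pred = np.full(len(y), y.mean())
    for _ in range(n_rounds):
        residual = y - pred  # negative gradient of the squared loss
        j, threshold, left, right, gain = best_stump(X, residual)
        if gain <= 0:
            break
        importance[j] += gain
        pred += learning_rate * np.where(X[:, j] <= threshold, left, right)
    total = importance.sum()
    return importance / total if total > 0 else importance

# Toy usage: the target gene is driven mostly by factor 2 of 5.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)
scores = score_regulators(X, y)
```

Running the scorer once per target gene, with all candidate transcription factors as predictors, would yield one column of a factor-by-gene score matrix whose entries can be ranked to propose regulatory edges.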

Conclusions

The superior accuracy achieved on three different benchmark data sets shows that ENNET is a top contender in the task of network inference. It is a versatile method that uses information about which gene was knocked out in which experiment when such information is available, but it remains the top performer even without it. ENNET is available for download from https://github.com/slawekj/ennet under the GNU GPLv3 license.

【 License 】

   
2013 Sławek and Arodź; licensee BioMed Central Ltd.
