Journal article details
BMC Systems Biology
TIGRESS: Trustful Inference of Gene REgulation using Stability Selection
Jean-Philippe Vert [2], Paola Vera-Licona [2], Fantine Mordelet [1], Anne-Claire Haury [2]
[1] Department of Computer Science, Duke University, Durham, NC 27708, USA; [2] U900, INSERM, Paris, F-75248, France
Keywords: Stability selection; LARS; Gene expression data; Feature selection; Gene Regulatory Network inference
DOI: 10.1186/1752-0509-6-145
Received 2012-04-22, accepted 2012-10-18, published 2012
【 Abstract 】

Background

Inferring the structure of gene regulatory networks (GRN) from a collection of gene expression data has many potential applications, from the elucidation of complex biological processes to the identification of potential drug targets. It is, however, a notoriously difficult problem, and the many existing methods reach only limited accuracy.

Results

In this paper, we formulate GRN inference as a sparse regression problem and investigate the performance of a popular feature selection method, least angle regression (LARS) combined with stability selection, for that purpose. We introduce a novel, robust and accurate scoring technique for stability selection, which improves the performance of feature selection with LARS. The resulting method, which we call TIGRESS (for Trustful Inference of Gene REgulation with Stability Selection), was ranked among the top GRN inference methods in the DREAM5 gene network inference challenge. In particular, TIGRESS was evaluated to be the best linear regression-based method in the challenge. We investigate in depth the influence of the various parameters of the method, and show that careful parameter tuning can lead to significant improvements and state-of-the-art performance for GRN inference, in both directed and undirected settings.
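
To make the idea of stability selection over a stepwise selection path concrete, the following is a minimal sketch and not the TIGRESS implementation. It assumes one regression per target gene on a set of candidate regulators, and it swaps in a simple matching-pursuit style greedy selector for LARS so that the example needs only the Python standard library. The function names (`greedy_path`, `stability_scores`), the half-sample resampling scheme, the rank-averaged score, and the toy data are all illustrative assumptions; the scoring technique introduced in the paper is not reproduced here.

```python
import random


def greedy_path(X, y, n_steps):
    """Return feature indices in the order a greedy selector picks them.

    Assumption: a matching-pursuit style stand-in for a LARS path. At each
    step the feature most correlated with the current residual is chosen.
    """
    n, p = len(X), len(X[0])
    # Center every column and the response so no intercept is needed.
    cols = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        cols.append([v - mean for v in col])
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]

    order = []
    for _ in range(min(n_steps, p)):
        best, best_corr, best_coef = None, 0.0, 0.0
        for j in range(p):
            if j in order:
                continue
            norm2 = sum(v * v for v in cols[j])
            if norm2 == 0.0:
                continue
            dot = sum(c * r for c, r in zip(cols[j], residual))
            corr = abs(dot) / norm2 ** 0.5
            if corr > best_corr:
                best, best_corr, best_coef = j, corr, dot / norm2
        if best is None:
            break
        order.append(best)
        # Remove the chosen feature's contribution from the residual.
        residual = [r - best_coef * c for r, c in zip(residual, cols[best])]
    return order


def stability_scores(X, y, n_steps=3, n_resamples=100, seed=0):
    """Score each feature by how early and how often it is selected
    across random half-samples of the data (hypothetical scoring rule)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    scores = [0.0] * p
    for _ in range(n_resamples):
        rows = rng.sample(range(n), n // 2)
        order = greedy_path([X[i] for i in rows], [y[i] for i in rows], n_steps)
        for rank, j in enumerate(order):
            # A feature entering at step rank+1 is present in the first l
            # steps for every l from rank+1 to n_steps.
            scores[j] += (n_steps - rank) / n_steps
    return [s / n_resamples for s in scores]


# Toy data: the target depends on candidate regulators 0 and 3 only.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [2.0 * row[0] - 1.5 * row[3] + data_rng.gauss(0, 0.3) for row in X]
scores = stability_scores(X, y, n_steps=3, n_resamples=100, seed=1)
for j, s in enumerate(scores):
    print(f"regulator {j}: score {s:.2f}")
```

In this sketch, ranking the candidate regulators of every target gene by such scores and pooling the rankings would give a scored list of directed edges; a real analysis would replace the greedy selector with an actual LARS implementation.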

Conclusions

TIGRESS reaches state-of-the-art performance on benchmark data, including both in silico and in vivo (E. coli and S. cerevisiae) networks. This study confirms the potential of feature selection techniques for GRN inference. Code and data are available at http://cbio.ensmp.fr/tigress. Moreover, TIGRESS can be run online through the GenePattern platform (GP-DREAM, http://dream.broadinstitute.org).

【 License 】

   
© 2012 Haury et al.; licensee BioMed Central Ltd.
