Journal article details
Journal of Clinical Bioinformatics
A novel tree-based procedure for deciphering the genomic spectrum of clinical disease entities
Philippe Broët [1], Wilson Toussile [2], Hervé Perdry [2], Cyprien Mbogning [3]
[1] Assistance Publique – Hôpitaux de Paris, Hôpital Paul Brousse, Villejuif, France;Faculty of Medicine Paris-Sud, 63 rue Gabriel Peri, 94276 Le Kremlin-Bicêtre, France;Inserm U669, 14-16 Avenue Paul-Vaillant-Couturier, 94807 Villejuif, France
Keywords: Genomic; Disease taxonomy; Lung cancer; Tree-based regression; Recursive partitioning
DOI  :  10.1186/2043-9113-4-6
Received 2013-12-23; accepted 2014-04-08; published 2014
【 Abstract 】

Background

Dissecting the genomic spectrum of clinical disease entities is a challenging task. Recursive partitioning (classification tree) methods are powerful tools for exploring the complex interplay among genomic factors with respect to a main factor, and can reveal hidden genomic patterns. To take confounding variables into account, the partially linear tree-based regression (PLTR) model, which combines regression models with tree-based methodology, was recently proposed. It is, however, computationally burdensome and not well suited to situations in which a large number of exploratory variables is expected.
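
As a rough illustration of what "combining regression models and tree-based methodology" means, a partially linear tree-based model can be sketched as a generalized linear model in which the confounders enter linearly and the genomic variables enter through the leaves of a tree. The notation below is an assumption made for illustration and is not taken from this record: Y is the outcome, X the confounders, G the genomic variables, g a link function, and leaf_1, ..., leaf_J the terminal nodes of the tree.

```latex
g\bigl(\mathbb{E}[Y \mid X, G]\bigr)
  = X^{\top}\theta
  + \sum_{j=1}^{J} \beta_j \,\mathbf{1}\{G \in \mathrm{leaf}_j\}
```

If X contains an intercept, one leaf has to be taken as the reference category (its coefficient fixed at zero) for the parameters to be identifiable.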

Methods

We developed a novel procedure as an alternative to the original PLTR procedure and considered different selection criteria. A simulation study covering several scenarios was performed to compare the performance of the proposed procedure with that of the original PLTR strategy.

Results

The proposed procedure with the Bayesian Information Criterion (BIC) performed well in detecting the hidden structure compared with the original procedure. The novel procedure was then used to analyze patterns of copy-number alterations in lung adenocarcinomas with respect to Kirsten Rat Sarcoma Viral Oncogene Homolog (KRAS) gene mutation status, while controlling for a cohort effect. The results highlight two subgroups of pure or nearly pure wild-type KRAS tumors with particular copy-number alteration patterns.
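
To make the role of the BIC concrete, the following minimal sketch shows how a penalized-likelihood criterion can pick one model from a set of candidate trees. It uses the standard definition BIC = -2 log L + k log n. The abstract does not describe how the candidate trees are generated or how parameters are counted, so the function names, the candidate list, and all numeric values here are hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian Information Criterion: lower is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def select_by_bic(candidates, n_obs):
    """Return the (name, log_likelihood, n_params) tuple with the smallest BIC."""
    return min(candidates, key=lambda c: bic(c[1], c[2], n_obs))

# Hypothetical candidates: (label, maximized log-likelihood, number of parameters).
# The values are made up for illustration and do not come from the paper.
candidates = [
    ("root only", -140.0, 2),
    ("3 leaves", -128.0, 4),
    ("6 leaves", -126.5, 7),
]

best = select_by_bic(candidates, n_obs=200)
print(best[0])
```

In this toy example the larger tree fits slightly better, but its gain in log-likelihood does not offset the log(n) penalty per extra parameter, so the intermediate tree is retained.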

Conclusions

The proposed procedure with the BIC is a powerful and practical alternative to the original procedure. It performs well in a general framework and is simple to implement.

【 License 】

   
2014 Mbogning et al.; licensee BioMed Central Ltd.
