Journal article details
BMC Bioinformatics
A unified computational model for revealing and predicting subtle subtypes of cancers
Xianwen Ren [1], Yong Wang [3], Jiguang Wang [2], Xiang-Sun Zhang [3]
[1] MOH Key Laboratory of Systems Biology of Pathogens, Institute of Pathogen Biology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
[2] Beijing Institute of Genomics, Chinese Academy of Sciences, 7 Beitucheng West Road, Beijing 100029, China
[3] National Center for Mathematics and Interdisciplinary Sciences, Chinese Academy of Sciences, Beijing 100190, China
Keywords: Cancer; Quadratic programming; Class prediction; Class discovery
DOI  :  10.1186/1471-2105-13-70
Received 2011-12-10, accepted 2012-04-04, published 2012
【 Abstract 】

Background

Gene expression profiling technologies have gradually become a community standard tool for clinical applications. For example, gene expression data have been analyzed to reveal novel disease subtypes (class discovery) and to assign particular samples to well-defined classes (class prediction). In the past decade, many effective methods have been proposed for each of these tasks separately. However, there is still a pressing need for a unified framework that can reveal the complicated relationships between samples.

Results

We propose a novel convex optimization model that performs class discovery and class prediction in a unified framework. We design an efficient algorithm and develop software named OTCC (Optimization Tool for Clustering and Classification). A comparison on a simulated dataset shows that our method outperforms the existing methods. We then apply OTCC to acute leukemia and breast cancer datasets. The results demonstrate that our method can not only reveal the subtle structures underlying these cancer gene expression data but also accurately predict the class labels of unknown cancer samples. Our method therefore holds promise for identifying novel cancer subtypes and improving diagnosis.

Conclusions

We propose a unified computational framework for class discovery and class prediction to facilitate the discovery and prediction of subtle subtypes of cancers. Because it requires only the similarities among samples as input, our method can be applied to many types of measurements, e.g., gene expression profiles, proteomic measurements, and next-generation sequencing data.
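
As a rough illustration of how a convex quadratic model can work from sample similarities alone, the sketch below propagates a few known class labels to unlabeled samples by minimizing the sum of s_ij (f_i - f_j)^2 with the labeled scores held fixed. This is a generic harmonic label-propagation scheme solved by coordinate descent, not the OTCC formulation, which the abstract does not spell out. The function name `propagate_labels`, the +1/-1/0 label encoding, and the `sweeps` parameter are assumptions made for this example.

```python
def propagate_labels(similarity, labels, sweeps=200):
    """Generic sketch (not OTCC): spread known labels over a similarity graph.

    similarity: symmetric list of lists with non-negative entries s_ij.
    labels: +1 or -1 for samples with a known class, 0 for unknown samples.
    Returns one continuous score per sample; labeled scores stay fixed.
    """
    n = len(labels)
    scores = [float(y) for y in labels]
    for _ in range(sweeps):
        for i in range(n):
            if labels[i] != 0:
                continue  # known samples keep their label as the score
            degree = sum(similarity[i][j] for j in range(n) if j != i)
            if degree > 0:
                # Minimizing sum_j s_ij * (f_i - f_j)^2 over f_i alone gives
                # the similarity-weighted mean of the other scores.
                weighted = sum(similarity[i][j] * scores[j]
                               for j in range(n) if j != i)
                scores[i] = weighted / degree
    return scores


# Toy data (made up): samples 0 and 1 are similar, samples 2 and 3 are similar.
toy_similarity = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
toy_labels = [1, 0, -1, 0]  # sample 0 is class +1, sample 2 is class -1
toy_scores = propagate_labels(toy_similarity, toy_labels)
```

In this toy run the unlabeled sample 1 receives a positive score and sample 3 a negative one, so thresholding the scores at zero acts as class prediction, while the magnitudes give a coarse view of how firmly each sample sits in its group.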

【 License 】

   
© 2012 Ren et al.; licensee BioMed Central Ltd.
