BMC Bioinformatics | |
Multi-resolution independent component analysis for high-performance tumor classification and biomarker discovery | |
Research | |
Henry Han1  Xiao-Li Li2  | |
[1] Center for Computational Medicine and Bioinformatics, University of Michigan, 48109, Ann Arbor, MI, USA;Departament of Mathematics and Bioinformatics, Eastern Michigan University, 48197, Ypsilanti, MI, USA;Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), 138632, Singapore, USA; | |
关键词: Support Vector Machine; Feature Selection; Linear Discriminant Analysis; Independent Component Analysis; Discrete Wavelet Transform; | |
DOI : 10.1186/1471-2105-12-S1-S7 | |
来源: Springer | |
【 摘 要 】
BackgroundAlthough high-throughput microarray based molecular diagnostic technologies show a great promise in cancer diagnosis, it is still far from a clinical application due to its low and instable sensitivities and specificities in cancer molecular pattern recognition. In fact, high-dimensional and heterogeneous tumor profiles challenge current machine learning methodologies for its small number of samples and large or even huge number of variables (genes). This naturally calls for the use of an effective feature selection in microarray data classification.MethodsWe propose a novel feature selection method: multi-resolution independent component analysis (MICA) for large-scale gene expression data. This method overcomes the weak points of the widely used transform-based feature selection methods such as principal component analysis (PCA), independent component analysis (ICA), and nonnegative matrix factorization (NMF) by avoiding their global feature-selection mechanism. In addition to demonstrating the effectiveness of the multi-resolution independent component analysis in meaningful biomarker discovery, we present a multi-resolution independent component analysis based support vector machines (MICA-SVM) and linear discriminant analysis (MICA-LDA) to attain high-performance classifications in low-dimensional spaces.ResultsWe have demonstrated the superiority and stability of our algorithms by performing comprehensive experimental comparisons with nine state-of-the-art algorithms on six high-dimensional heterogeneous profiles under cross validations. Our classification algorithms, especially, MICA-SVM, not only accomplish clinical or near-clinical level sensitivities and specificities, but also show strong performance stability over its peers in classification. Software that implements the major algorithm and data sets on which this paper focuses are freely available at https://sites.google.com/site/heyaumapbc2011/.ConclusionsThis work suggests a new direction to accelerate microarray technologies into a clinical routine through building a high-performance classifier to attain clinical-level sensitivities and specificities by treating an input profile as a ‘profile-biomarker’. The multi-resolution data analysis based redundant global feature suppressing and effective local feature extraction also have a positive impact on large scale ‘omics’ data mining.
【 授权许可】
CC BY
© Han and Li; licensee BioMed Central Ltd. 2011
【 预 览 】
Files | Size | Format | View |
---|---|---|---|
RO202311096730866ZK.pdf | 568KB | download |
【 参考文献 】
- [1]
- [2]
- [3]
- [4]
- [5]
- [6]
- [7]
- [8]
- [9]
- [10]
- [11]
- [12]
- [13]
- [14]
- [15]
- [16]
- [17]
- [18]
- [19]
- [20]
- [21]
- [22]
- [23]
- [24]