JOURNAL OF CHEMICAL ENGINEERING OF JAPAN
Classification and Diagnostic Output Prediction of Cancer Using Gene Expression Profiling and Supervised Machine Learning Algorithms
Krist V. Gernaey [2], Changkyoo Yoo [1]
[1] College of Environment and Applied Chemistry, Green Energy Center/Center for Environmental Studies, Kyung Hee University
[2] Department of Chemical Engineering, Technical University of Denmark
Keywords: Bioinformatics; Data Mining; Gene Expression Profiling; Cancer Classification; Supervised Clustering
DOI: 10.1252/jcej.08we042
Source: Maruzen Company Ltd
【 Abstract 】
In this paper, a new supervised clustering and classification method is proposed. First, discriminant partial least squares (DPLS) is applied to a gene expression microarray data set to select a minimum number of key genes. Second, supervised hierarchical clustering based on the cancer type information is proposed to find key gene groups and to group the cancer samples into different subclasses. Here, the weights of the genes in the DPLS model are proportional to their importance in determining the class labels, that is, to the variable importance in the projection (VIP) information of the DPLS method. The power of the gene selection method and the proposed supervised hierarchical clustering method is illustrated on three microarray data sets: leukemia, breast cancer, and colon cancer. Supervised machine learning algorithms thus enable subtype classification of the three data sets solely on the basis of molecular-level monitoring. Compared to unsupervised clustering, the supervised method performed better at discriminating between cancer types and cancer subtypes for the leukemia data set. The performance of the proposed method, using only a limited set of informative genes, is demonstrated to be comparable to or better than results reported in the literature for the three data sets. Furthermore, the method was successful in predicting the outcome of medical treatment (success or failure) from the microarray data, which could make the method an important tool for clinicians.
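
The VIP-based gene selection step described in the abstract can be illustrated with a minimal sketch. It assumes a two-class problem encoded as a 0/1 response, a plain single-response NIPALS PLS fit, and synthetic expression data; the function name `pls1_vip`, the number of components, and the VIP > 1 cutoff (a common rule of thumb) are assumptions made for illustration and are not taken from the paper, whose DPLS implementation may differ in scaling, class encoding, and component selection.

```python
import numpy as np

def pls1_vip(X, y, n_components=2):
    """Fit single-response PLS by NIPALS and return one VIP score per column of X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X = X - X.mean(axis=0)          # center each gene
    y = y - y.mean()                # center the class indicator
    n_genes = X.shape[1]
    weights = []                    # unit-norm weight vectors, one per component
    explained = []                  # response sum of squares explained per component
    for _ in range(n_components):
        w = X.T @ y
        norm = np.linalg.norm(w)
        if norm < 1e-12:            # no remaining covariance with the response
            break
        w = w / norm
        t = X @ w                   # sample scores
        tt = t @ t
        p = X.T @ t / tt            # gene loadings
        q = (y @ t) / tt            # response loading
        X = X - np.outer(t, p)      # deflate the expression matrix
        y = y - q * t               # deflate the response
        weights.append(w)
        explained.append(q * q * tt)
    W = np.array(weights)           # shape (components, genes)
    ss = np.array(explained)        # shape (components,)
    # VIP_j = sqrt(n_genes * sum_a ss_a * w_aj^2 / sum_a ss_a)
    return np.sqrt(n_genes * (ss @ W**2) / ss.sum())

# Synthetic stand-in for a microarray data set (hypothetical, not the paper's data).
rng = np.random.default_rng(0)
n_samples, n_genes = 40, 50
labels = np.repeat([0.0, 1.0], n_samples // 2)    # two hypothetical cancer classes
expr = rng.normal(size=(n_samples, n_genes))
expr[:, :3] += 3.0 * labels[:, None]              # genes 0-2 carry the class signal

vip = pls1_vip(expr, labels, n_components=2)
top_genes = np.argsort(vip)[::-1][:3]             # three highest-ranked genes
selected = np.flatnonzero(vip > 1.0)              # rule-of-thumb cutoff (assumption)
print("top genes:", top_genes, "selected:", selected)
```

Because each weight vector has unit norm, the squared VIP scores average to one across genes, which is why a cutoff near 1 is often used to separate informative from uninformative variables. The selected genes, optionally weighted by their VIP scores, would then feed the supervised hierarchical clustering step.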
【 License 】
Unknown