BMC Bioinformatics
An embedded gene selection method using knockoffs optimizing neural network
Juncheng Guo1, Jianxiao Liu2, Yuanyuan Chen3, Min Jin3
[1] Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, 430070, Wuhan, China; Institute of Information Engineering, Chinese Academy of Sciences, 10049, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, 10049, Beijing, China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, 430070, Wuhan, China; National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, 430070, Wuhan, China; National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, 430070, Wuhan, China
Keywords: Gene mining; Neural network; Knockoffs; Nonlinear data; Maize
DOI: 10.1186/s12859-020-03717-w
Source: Springer
【 Abstract 】
Background: Gene selection refers to finding a small subset of discriminant genes from gene expression profiles. Effectively selecting the genes that affect specific phenotypic traits is an important research problem in biology. Neural networks fit nonlinear data well and can capture features automatically and flexibly. In this work, we propose an embedded gene selection method using a neural network. The important genes are obtained by calculating the weight coefficients after training is completed. To address the black-box nature of neural networks and make the training results interpretable, we use the idea of knockoffs to construct knockoff feature genes of the original feature genes. Each feature gene then competes not only with the other feature genes but also with its own knockoff feature gene, which helps to select the key genes that affect the decision-making of the neural network.

Results: We use maize carotenoid, tocopherol methyltransferase, and raffinose family oligosaccharide datasets, together with a human breast cancer dataset, for verification and analysis.

Conclusions: The experimental results demonstrate that the knockoffs optimizing neural network method detects relevant genes better than the other existing algorithms, especially when processing nonlinear gene expression and phenotype data.
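
The competition between a feature gene and its knockoff can be illustrated with a small sketch. The abstract does not state which statistic or threshold the authors use, so the code below assumes the standard knockoff+ filter: each gene gets a statistic W_j equal to its importance minus its knockoff's importance, and genes are kept when W_j reaches a data-dependent threshold chosen for a target false discovery rate q. The function names, the importance values, and q = 0.2 are hypothetical and chosen only for illustration; in practice the importance scores might be derived from the trained network's weight coefficients.

```python
import numpy as np


def knockoff_statistics(z_orig, z_knock):
    """W_j = |Z_j| - |Z~_j|: positive when a gene beats its own knockoff."""
    return np.abs(z_orig) - np.abs(z_knock)


def knockoff_plus_threshold(w, q):
    """Smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q."""
    candidates = np.sort(np.abs(w[w != 0]))
    for t in candidates:
        fdp_estimate = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_estimate <= q:
            return float(t)
    return float("inf")  # no threshold meets the target; select nothing


def select_genes(z_orig, z_knock, q):
    """Return indices of genes whose statistic reaches the threshold."""
    w = knockoff_statistics(z_orig, z_knock)
    t = knockoff_plus_threshold(w, q)
    return np.flatnonzero(w >= t)


# Hypothetical importance scores (assumed, not from the paper):
# genes 0-5 clearly beat their knockoffs, genes 6-9 do not.
z_orig = np.array([3.0, 2.5, 2.75, 3.25, 3.0, 2.5, 0.25, 0.5, 0.25, 0.75])
z_knock = np.array([0.25, 0.25, 0.25, 0.25, 0.25, 0.25, 0.5, 0.25, 0.75, 0.25])

w = knockoff_statistics(z_orig, z_knock)
threshold = knockoff_plus_threshold(w, q=0.2)
selected = select_genes(z_orig, z_knock, q=0.2)
print("threshold:", threshold)
print("selected gene indices:", selected.tolist())
```

Genes with near-zero or negative W_j are indistinguishable from their knockoffs and are discarded, which is what makes the selection interpretable: a gene is kept only if the model relies on it more than on a synthetic copy that carries no signal about the phenotype.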
【 License 】
CC BY