IEEE Access
Deep Principal Correlated Auto-Encoders With Application to Imaging and Genomics Data Integration
Gang Li [1], Yu-Ping Wang [1], Yi-Pu Zhang [1], Chao Wang [2], Peng Peng [2], De-Peng Han [3], Vince D. Calhoun [3]
[1] School of Electronic and Control Engineering, Chang'an University, Xi'an, China;
Keywords: Classification; data fusion; dimensionality reduction; belief network; optimization algorithm; principal component analysis
DOI: 10.1109/ACCESS.2020.2968634
Source: DOAJ
【 Abstract 】
Studies of complex diseases such as schizophrenia increasingly treat genetic variants and brain imaging phenotypes as important factors. In this paper, we propose an optimization model to overcome the weakness of deep canonical correlation analysis (DCCA). The model combines principal component analysis (PCA), which learns linear features from each modality, with multi-layer belief networks, which learn nonlinear features from each modality. To improve the results of correlation analysis and classification, the output nodes of the multi-layer belief networks are used to train a back-propagation (BP) network. Previous work on canonical correlation analysis (CCA) has proposed several models based on deep neural networks or regularization, typically involving either some form of norm or auto-encoders with a reconstruction objective. Many existing advanced models have been developed to find the maximal correlation in multi-modality data. However, such data tend to have more feature dimensions than samples. We apply the proposed model, deep principal correlated auto-encoders (DPCAE), to a real multi-modality data set and compare it experimentally with several previous models on fMRI imaging and SNP genomics data. The results show that DPCAE learns features with higher correlation and achieves better classification performance than the previous models. Its classification accuracy exceeds 90%, compared with about 65% for the CCA-based models and about 80% for the DNN-based models.
In the clustering performance evaluation, DPCAE further confirmed its superior classification performance, with an average normalized mutual information of 93.75% and an average classification error rate of 3.8%. In the maximal correlation analysis, the model outperforms other advanced models with a maximal correlation of 0.926, showing excellent performance in high-dimensional data analysis.
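To make the high-dimensional setting concrete, the following is a minimal sketch of the linear part of such a pipeline only: each modality is reduced with PCA, and the canonical correlations between the reduced views are then computed with a classical ridge-regularized CCA. It is not the DPCAE model; the belief networks and BP training described in the abstract are omitted. The function names (`pca_scores`, `canonical_correlations`), the ridge term `reg`, and the synthetic "imaging" and "SNP" matrices are all assumptions made for illustration.

```python
import numpy as np


def pca_scores(X, k):
    """Return the top-k principal component scores of X (n samples x p features).

    Uses the SVD of the centered data, which stays well defined when p > n.
    """
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]


def canonical_correlations(A, B, reg=1e-3):
    """Canonical correlations between views A (n x a) and B (n x b).

    `reg` is a small ridge term added to each covariance matrix
    (an assumption of this sketch, not a value from the paper).
    """
    n = A.shape[0]
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    Saa = A.T @ A / (n - 1) + reg * np.eye(A.shape[1])
    Sbb = B.T @ B / (n - 1) + reg * np.eye(B.shape[1])
    Sab = A.T @ B / (n - 1)
    # Whiten with Cholesky factors: T = La^{-1} Sab Lb^{-T}.
    La = np.linalg.cholesky(Saa)
    Lb = np.linalg.cholesky(Sbb)
    T = np.linalg.solve(La, Sab)
    T = np.linalg.solve(Lb, T.T).T
    # The singular values of T are the canonical correlations (descending).
    return np.linalg.svd(T, compute_uv=False)


# Synthetic stand-in data: two views that share a 2-dimensional latent signal,
# each with more features than samples (hypothetical sizes).
rng = np.random.default_rng(0)
n, p_img, p_snp, k = 100, 500, 300, 5
z = rng.standard_normal((n, 2))
X_img = z @ rng.standard_normal((2, p_img)) + 0.5 * rng.standard_normal((n, p_img))
X_snp = z @ rng.standard_normal((2, p_snp)) + 0.5 * rng.standard_normal((n, p_snp))

rho = canonical_correlations(pca_scores(X_img, k), pca_scores(X_snp, k))
print("canonical correlations:", np.round(rho, 3))
```

Reducing each view to `k` components before CCA keeps the covariance matrices small and invertible even though the raw feature count exceeds the sample count, which is the difficulty the abstract points to.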
【 License 】
Unknown