Journal Article Details
Journal of Biological Engineering
RN-Autoencoder: Reduced Noise Autoencoder for classifying imbalanced cancer genomic data
Research
Marwa Radad [1], Ahmed Arafa [1], Mohammed Badawy [1], Nawal El-Fishawy [1]
[1] Faculty of Electronic Engineering, Menoufia University, Box No. 32951, El-Gish Street, Menouf, Menoufia, Egypt
Keywords: RN-Autoencoder; Cancer Classification; Gene Expressions; Imbalanced Classification; RN-SMOTE; Dimensionality Reduction
DOI: 10.1186/s13036-022-00319-3
Received 2022-11-13; accepted 2022-12-12; publication year 2022
Source: Springer
【 Abstract 】

Background: In the current genomic era, gene expression datasets have become one of the main tools used in cancer classification. Both the curse of dimensionality and class imbalance are inherent characteristics of these datasets, and both degrade the performance of most classifiers applied to genomic data.

Results: This paper introduces the Reduced Noise-Autoencoder (RN-Autoencoder) for pre-processing imbalanced genomic datasets for precise cancer classification. First, RN-Autoencoder addresses the curse of dimensionality by using an autoencoder for feature reduction, generating new extracted data with lower dimensionality. Next, RN-Autoencoder passes the extracted data to the Reduced Noise-Synthetic Minority Over-sampling Technique (RN-SMOTE), which handles the class imbalance in the extracted data. RN-Autoencoder was evaluated using different classifiers and various imbalanced datasets with different imbalance ratios. The results show that classifier performance improved with RN-Autoencoder relative to both the original data and the extracted data, by margins that depend on the classifier, dataset, and evaluation metric. RN-Autoencoder was also compared with the current state of the art and achieved increases in test accuracy of up to 18.017%, 19.183%, 18.58%, and 8.87% on the colon, leukemia, Diffuse Large B-Cell Lymphoma (DLBCL), and Wisconsin Diagnostic Breast Cancer (WDBC) datasets, respectively.

Conclusion: RN-Autoencoder is a model for cancer classification using imbalanced gene expression datasets. It uses an autoencoder to reduce the high dimensionality of gene expression datasets and then handles the class imbalance using RN-SMOTE. RN-Autoencoder was evaluated using many different classifiers and many different imbalanced datasets. The performance of many classifiers improved, and some classified cancer with 100% on all metrics used. In addition, RN-Autoencoder outperformed many recent works that use the same datasets.
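The oversampling stage described above can be illustrated with a minimal sketch. It shows only the plain SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours) together with an imbalance-ratio check. It is not the authors' implementation: the noise-reduction step of RN-SMOTE is not described in this abstract and is omitted, the toy 2-D points merely stand in for autoencoder output, and the function names `imbalance_ratio` and `smote_like_oversample` are hypothetical.

```python
import math
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE idea;
    no noise filtering is performed here)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of `base`, excluding `base` itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append([b + gap * (m - b) for b, m in zip(base, mate)])
    return synthetic


# Toy low-dimensional features (assumed), standing in for autoencoder output.
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1]]
minority = [[2.0, 2.1], [2.2, 1.9], [1.8, 2.0]]
labels = [0] * len(majority) + [1] * len(minority)

# Generate enough synthetic minority points to match the majority count.
new_points = smote_like_oversample(minority, len(majority) - len(minority), k=2)
balanced_labels = labels + [1] * len(new_points)

print(imbalance_ratio(labels), imbalance_ratio(balanced_labels))  # 2.0 1.0
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies. This is also why noisy minority samples can be harmful to plain SMOTE, which is the kind of problem a noise-reduction step would target.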

【 License 】

CC BY   
© The Author(s) 2023
