Journal Article Details
Big Data and Cognitive Computing, Volume 6
Combination of Reduction Detection Using TOPSIS for Gene Expression Data Analysis
Tapas Kumar Mishra [1], Sambit Kumar Mishra [1], Deepak Puthal [2], Rasmita Dash [3], Jogeswar Tripathy [3], Binod Kumar Pattanayak [3]
[1] Department of Computer Science and Engineering, SRM University-AP, Amaravati 522502, India;
[2] Department of Electrical Engineering and Computer Science, Khalifa University, Abu Dhabi 127788, United Arab Emirates;
[3] ITER, Siksha ‘O’ Anusandhan Deemed to be University, Bhubaneswar 751030, India;
Keywords: feature selection; machine learning; microarray gene extraction; pipelining; TOPSIS
DOI: 10.3390/bdcc6010024
Source: DOAJ
[Abstract]

In high-dimensional data analysis, Feature Selection (FS) is one of the most fundamental problems in machine learning. High-dimensional datasets have a very large feature space, of which only a few features are significant for analysis, so extracting the significant features is crucial. Among the various feature selection techniques available, filter techniques are prominent because they can be used with any type of learning algorithm, drastically lower the running time of optimization algorithms, and improve the performance of the model. However, the effectiveness of a filter approach depends on the characteristics of the dataset as well as on the machine learning model. To address this dependence, this research considers a combination of feature reduction (CFR), designing a pipeline of filter approaches for high-dimensional microarray data classification. From four filter approaches, sixteen pipeline combinations are generated. The feature subset is reduced at different levels, and ultimately the significant feature set is evaluated. The pipelined filter techniques are Correlation-Based Feature Selection (CBFS), Chi-Square Test (CST), Information Gain (InG), and Relief Feature Selection (RFS), and the classification techniques are Decision Tree (DT), Logistic Regression (LR), Random Forest (RF), and k-Nearest Neighbor (k-NN). The performance of CFR depends highly on the datasets as well as on the classifiers. Therefore, the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method is used to rank all reduction combinations and identify the superior filter combination.
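
The TOPSIS ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is the standard textbook form of TOPSIS (vector normalization, weighted ideal and anti-ideal points, Euclidean distances), not necessarily the exact variant used by the authors. The function name topsis, the equal weights, the choice of one accuracy column per classifier as criteria, and every number in the example matrix are assumptions made up purely for illustration; they are not results from the paper.

```python
import numpy as np

def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness score (0..1) of each alternative (row).

    matrix  : alternatives x criteria scores
    weights : one weight per criterion (normalized here to sum to 1)
    benefit : True where larger is better, False where smaller is better
    """
    X = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    benefit = np.asarray(benefit, dtype=bool)

    # Vector-normalize each criterion column, then apply the weights.
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero columns
    V = (X / norms) * w

    # Ideal and anti-ideal points, chosen per criterion direction.
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))

    # Euclidean distance of every alternative to both reference points.
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - anti, axis=1)

    # Relative closeness: 1 means at the ideal, 0 means at the anti-ideal.
    denom = d_pos + d_neg
    return np.divide(d_neg, denom, out=np.zeros_like(d_neg), where=denom > 0)

# Hypothetical accuracies of three made-up pipelines under DT, LR, RF, k-NN.
names = ["pipeline_A", "pipeline_B", "pipeline_C"]
scores = [
    [0.90, 0.88, 0.93, 0.85],
    [0.84, 0.91, 0.89, 0.87],
    [0.78, 0.80, 0.82, 0.79],
]
closeness = topsis(scores, weights=[1, 1, 1, 1], benefit=[True] * 4)
for idx in np.argsort(-closeness):  # highest closeness first
    print(names[idx], round(float(closeness[idx]), 3))
```

Alternatives are ranked by descending closeness, so a pipeline that is best on every criterion scores 1 and one that is worst on every criterion scores 0. A cost-type criterion, such as the number of retained features, could be included by setting its entry in benefit to False.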

[License]

Unknown   
