Conference Paper Details
Joint Conference on Green Engineering Technology & Applied Computing 2019
Utilization of Filter Feature Selection with Support Vector Machine for Tumours Classification
Industrial Technology (General); Computer Science
Tengku Mazlin, T.A.H.^1 ; Sallehuddin, R.^1 ; Zuriahati, M.Y.^1
Faculty of Engineering, School of Computing, Universiti Teknologi Malaysia, Skudai, Johor Bahru 81310, Malaysia^1
Keywords: Cancer classification; Classification accuracy; Classification performance; Evaluation metrics; High dimensional data; Performance measurements; Receiver operating characteristic curves; Technology advancement
Others  :  https://iopscience.iop.org/article/10.1088/1757-899X/551/1/012062/pdf
DOI  :  10.1088/1757-899X/551/1/012062
Source: IOP
【 Abstract 】

Owing to rapid technological advancement, machine learning is now widely used to solve cancer classification problems. Classification performance depends heavily on the quality of the input features. As the number of features in high-dimensional data grows explosively, ambiguous samples and data redundancy lead directly to poor classification accuracy. This paper therefore applies filter feature selection using four filter methods: Information Gain, Gain Ratio, Chi-Squared and Relief-F. Each method ranks the attributes so that irrelevant and redundant features can be removed and the significance and correlation of the input data can be evaluated. Classification is then performed with a Support Vector Machine (SVM) to measure accuracy as a function of the number of selected features. Performance is validated on a standard Breast Cancer dataset of 286 instances obtained from the UCI repository. Accuracy, sensitivity, specificity and Area under the Receiver Operating Characteristic Curve (AUC) are used to assess the SVM classifier under each of the four filter methods. Experimental results show that Gain Ratio improves the accuracy of SVM classification compared with Information Gain, Chi-Squared and Relief-F when classifying breast cancer data with only a small number of selected features.
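The attribute-ranking step described in the abstract can be illustrated with a minimal sketch, assuming categorical attributes and the textbook definitions of Information Gain and Gain Ratio. It is not the authors' implementation: the function names, the toy rows and the attribute names (`attr_a`, `attr_b`, `row_id`) are hypothetical, and the SVM step is omitted. A small helper shows how accuracy, sensitivity and specificity follow from binary predictions.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(column, labels):
    """Reduction in class entropy after splitting on one categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(column, labels):
    """Information gain normalised by the entropy of the attribute itself."""
    split_info = entropy(column)
    return information_gain(column, labels) / split_info if split_info > 0 else 0.0


def rank_attributes(rows, labels, names, score):
    """Return (name, score) pairs sorted from highest to lowest filter score."""
    scores = [(name, score([row[i] for row in rows], labels))
              for i, name in enumerate(names)]
    return sorted(scores, key=lambda item: item[1], reverse=True)


def binary_metrics(y_true, y_pred, positive):
    """Accuracy, sensitivity and specificity; assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical toy data (not the UCI dataset): row_id is unique per row.
names = ["attr_a", "attr_b", "row_id"]
rows = [("yes", "left", "r1"), ("yes", "right", "r2"),
        ("no", "left", "r3"), ("no", "right", "r4")]
labels = ["pos", "pos", "neg", "neg"]

ig_ranking = rank_attributes(rows, labels, names, information_gain)
gr_ranking = rank_attributes(rows, labels, names, gain_ratio)
metrics = binary_metrics(["pos", "pos", "pos", "neg", "neg"],
                         ["pos", "pos", "neg", "neg", "pos"], positive="pos")
```

On this toy data, Information Gain scores the unique `row_id` attribute as highly as the genuinely predictive `attr_a`, whereas Gain Ratio penalises it for having many distinct values. That bias correction is the usual motivation for preferring Gain Ratio; whether it explains the result reported in the abstract is not stated there.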
