14th International Conference on Science, Engineering and Technology | |
Sentiment analysis of feature ranking methods for classification accuracy | |
Natural sciences; industrial technology | |
Joseph, Shashank^1 ; Mugauri, Calvin^1 ; Sumathy, S.^1 | |
School of Information Technology and Engineering, VIT University, Vellore 632014, India^1 | |
Keywords: Classification accuracy; datasets; Document frequency; Feature ranking; Log likelihood ratio; Pre-processing; Text classification; Text preprocessing | |
Others : https://iopscience.iop.org/article/10.1088/1757-899X/263/4/042011/pdf DOI : 10.1088/1757-899X/263/4/042011 |
Source: IOP | |
【 Abstract 】
Text pre-processing and feature selection are critical steps in text mining. Pre-processing large volumes of data is difficult because unstructured raw data must be converted into a structured format. Traditional methods of processing and weighting were slow and less accurate. To overcome this challenge, feature ranking techniques have been devised. The feature set produced by text pre-processing is fed as input to feature selection, which helps improve text classification accuracy. Of the three categories of feature selection methods, this work focuses on the filter category. Five feature ranking methods are analyzed: document frequency, standard deviation, information gain, chi-square, and weighted log-likelihood ratio.
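To make the filter approach concrete, the sketch below ranks terms with two of the methods named in the abstract, document frequency and chi-square. It is only an illustration and not the authors' implementation: the toy corpus, the labels, and the function names (`document_frequency`, `chi_square`, `top_k`) are assumptions introduced here, and the chi-square score uses the standard 2x2 term-versus-class contingency table.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df


def chi_square(docs, labels, positive):
    """Chi-square score of each term against one class (2x2 table)."""
    n = len(docs)
    n_pos = sum(1 for y in labels if y == positive)
    df_all = document_frequency(docs)
    df_pos = document_frequency(
        [d for d, y in zip(docs, labels) if y == positive]
    )
    scores = {}
    for term, present in df_all.items():
        a = df_pos.get(term, 0)   # term present, positive class
        b = present - a           # term present, other class
        c = n_pos - a             # term absent, positive class
        d = n - n_pos - b         # term absent, other class
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def top_k(scores, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical, already tokenized toy corpus (assumed for illustration only).
docs = [
    ["good", "movie", "great"],
    ["great", "plot", "good"],
    ["bad", "movie", "boring"],
    ["boring", "plot", "bad"],
]
labels = ["pos", "pos", "neg", "neg"]

df = document_frequency(docs)
chi = chi_square(docs, labels, positive="pos")
selected = top_k(chi, 4)
```

In this toy corpus every term appears in two documents, so document frequency alone cannot separate them, while chi-square scores the class-specific terms above the neutral ones ("movie", "plot"). A filter method keeps the top-ranked terms and passes only those to the classifier.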