2017 3rd International Conference on Environmental Science and Material Application
K-Nearest Neighbor Algorithm Optimization in Text Categorization
Ecological and environmental science; Materials science
Chen, Shufeng^1
Research Institute of Electronic Science and Technology, University of Electronic Science and Technology of China, Chengdu 611730, China^1
Keywords: Classification algorithm; K nearest neighbor (KNN); K nearest neighbor algorithm; K-nearest neighbors; Representative sample; Similarity calculation; Strong dependences; Text categorization
Others: https://iopscience.iop.org/article/10.1088/1755-1315/108/5/052074/pdf DOI: 10.1088/1755-1315/108/5/052074
Source: IOP
【 Abstract 】
The K-Nearest Neighbor (KNN) classification algorithm is one of the simplest methods in data mining. It has been widely used in classification, regression, and pattern recognition. The traditional KNN method has shortcomings, such as the large amount of computation over the samples and a strong dependence on the capacity of the sample library. In this paper, a method for optimizing the set of representative samples based on the CURE algorithm is proposed. Building on this, a quick algorithm, QKNN (Quick k-nearest neighbor), is presented to find the k nearest neighbor samples, which greatly reduces the number of similarity calculations. The experimental results show that this algorithm effectively reduces the number of samples and speeds up the search for the k nearest neighbor samples, improving the performance of the algorithm.
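For orientation, the traditional KNN baseline that the abstract refers to can be sketched as follows. This is a minimal illustration, not the paper's QKNN or its CURE-based sample reduction, neither of which is specified in this record. The function names (`cosine_similarity`, `knn_classify`), the whitespace tokenization, the choice of cosine similarity, and the toy sample library are all assumptions made for the example. It shows why the cost grows with the sample library: each query is compared against every stored sample.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors (dict-like)."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, samples, k=3):
    """Label `query` by majority vote among its k most similar samples.

    Every call computes one similarity per stored sample, so the work
    per query is proportional to the size of the sample library.
    """
    query_vec = Counter(query.lower().split())
    scored = [
        (cosine_similarity(query_vec, Counter(text.lower().split())), label)
        for text, label in samples
    ]
    # Highest similarity first; keep the k nearest neighbors.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy sample library, for illustration only.
TOY_SAMPLES = [
    ("the team won the football match", "sports"),
    ("the player scored a goal in the match", "sports"),
    ("the coach praised the team", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock markets fell as rates rose", "finance"),
    ("the investor bought bank shares", "finance"),
]

if __name__ == "__main__":
    print(knn_classify("team won the match", TOY_SAMPLES, k=3))
    print(knn_classify("bank rates rose", TOY_SAMPLES, k=3))
```

Reducing the sample library to a smaller set of representative samples, as the abstract proposes, would shrink the list that `knn_classify` has to scan on each query.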
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
K-Nearest Neighbor Algorithm Optimization in Text Categorization | 137KB | download |