NEUROCOMPUTING | Volume: 160 |
Hubness-aware kNN classification of high-dimensional data in presence of label noise | |
Article | |
Tomasev, Nenad [1]; Buza, Krisztian [2] | |
[1] Jozef Stefan Inst, Artificial Intelligence Lab, Ljubljana 1000, Slovenia | |
[2] Semmelweis Univ, Inst Genom Med & Rare Disorders, H-1083 Budapest, Hungary | |
Keywords: Classification; Label noise; K-nearest neighbor; High-dimensional data; Hubness; Neighbor occurrence models; | |
DOI : 10.1016/j.neucom.2014.10.084 | |
Source: Elsevier | |
【 Abstract 】
Learning with label noise is an important issue in classification, since it is not always possible to obtain reliable data labels. In this paper we explore and evaluate a new approach to learning with label noise in intrinsically high-dimensional data, based on using neighbor occurrence models for hubness-aware k-nearest neighbor classification. Hubness is an important aspect of the curse of dimensionality that has a negative effect on many types of similarity-based learning methods. As we will show, the emergence of hubs as centers of influence in high-dimensional data affects the learning process in the presence of label noise. We evaluate the potential impact of hub-centered noise by defining a hubness-proportional random label noise model that is shown to induce a significantly higher kNN misclassification rate than the uniform random label noise. Real-world examples are discussed where hubness-correlated noise arises either naturally or as a consequence of an adversarial attack. Our experimental evaluation reveals that hubness-based fuzzy k-nearest neighbor classification and Naive Hubness-Bayesian k-nearest neighbor classification might be suitable for learning under label noise in intrinsically high-dimensional data, as they exhibit robustness to high levels of random label noise and hubness-proportional random label noise. The results demonstrate promising performance across several data domains. (C) 2015 Elsevier B.V. All rights reserved.
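The abstract does not spell out the hubness-proportional noise model, so the following is only a minimal sketch of one plausible reading. It assumes that hubness is measured by the k-occurrence count (how often a point appears in other points' k-nearest-neighbor lists) and that labels are flipped for points sampled with probability proportional to that count. The function names `k_occurrences` and `hubness_proportional_noise`, the Euclidean distance, and the flipping rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def k_occurrences(X, k):
    """Count how often each point appears in other points' k-NN lists."""
    # Pairwise squared Euclidean distances (an assumed metric).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    nn = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest neighbors
    return np.bincount(nn.ravel(), minlength=X.shape[0])


def hubness_proportional_noise(y, occ, noise_rate, n_classes, rng):
    """Flip labels of points sampled with probability proportional to occ.

    Assumes n_classes >= 2 and integer labels in [0, n_classes).
    """
    n = len(y)
    n_flip = int(round(noise_rate * n))
    # Points with zero occurrences have zero sampling weight, so the
    # number of flips cannot exceed the number of non-zero weights.
    n_flip = min(n_flip, int(np.count_nonzero(occ)))
    p = occ / occ.sum()
    idx = rng.choice(n, size=n_flip, replace=False, p=p)
    y_noisy = y.copy()
    # A random non-zero offset guarantees that each selected label changes.
    offsets = rng.integers(1, n_classes, size=n_flip)
    y_noisy[idx] = (y[idx] + offsets) % n_classes
    return y_noisy, idx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 50))      # synthetic high-dimensional data
    y = rng.integers(0, 3, size=200)    # synthetic labels, 3 classes
    occ = k_occurrences(X, k=5)
    y_noisy, flipped = hubness_proportional_noise(y, occ, 0.2, 3, rng)
    print("max k-occurrence:", occ.max(), "flipped:", len(flipped))
```

Under this sketch, hubs (points with large k-occurrence counts) are the most likely to receive a wrong label, so a single corrupted hub can affect many kNN votes, which is the intuition behind the higher misclassification rate reported relative to uniform random noise.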
【 License 】
Free
【 Preview 】
Files | Size | Format | View |
---|---|---|---|
10_1016_j_neucom_2014_10_084.pdf | 2385KB | PDF | download |