Journal Article Details
PATTERN RECOGNITION Volume: 120
Potential Anchoring for imbalanced data classification
Article
Koziarski, Michal [1]
[1] AGH Univ Sci & Technol, Dept Elect, Mickiewicza 30, PL-30059 Krakow, Poland
Keywords: Machine learning; Classification; Imbalanced data; Oversampling; Undersampling; Radial basis functions
DOI: 10.1016/j.patcog.2021.108114
Source: Elsevier
【 Abstract 】

Data imbalance remains one of the factors that negatively affect the performance of contemporary machine learning algorithms. One of the most common approaches to reducing its negative impact is preprocessing the original dataset with data-level strategies. In this paper we propose a unified framework for imbalanced data over- and undersampling. The proposed approach utilizes radial basis functions to preserve the original shape of the underlying class distributions during the resampling process. This is done by optimizing the positions of generated synthetic observations with respect to the proposed potential resemblance loss. The final Potential Anchoring algorithm combines over- and undersampling within the proposed framework. The results of experiments conducted on 60 imbalanced datasets show that Potential Anchoring outperforms state-of-the-art resampling algorithms, including previously proposed methods that utilize radial basis functions to model class potential. Furthermore, the results of the analysis based on the proposed data complexity index show that Potential Anchoring is particularly well suited for handling naturally complex (i.e. not affected by the presence of noise) datasets. © 2021 The Author. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
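
The abstract describes the idea only at a high level, so the following is a minimal sketch of it rather than the paper's implementation. It assumes a Gaussian kernel of the form exp(-(d/gamma)^2), a mean squared difference between class potentials as a stand-in for the potential resemblance loss, the original observations themselves as the points where potentials are compared, and plain gradient descent. All names (`potential`, `resemblance_loss_and_grad`, `resample`, `gamma`, `lr`, `n_iter`) are hypothetical and chosen for illustration.

```python
import numpy as np


def potential(points, anchors, gamma=0.5):
    """Mean Gaussian RBF potential of `points`, evaluated at each anchor."""
    # Squared Euclidean distances between anchors and points: shape (A, n).
    sq_dist = ((anchors[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / gamma**2).mean(axis=1)


def resemblance_loss_and_grad(prototypes, original, anchors, gamma=0.5):
    """Mean squared potential mismatch at the anchors (an assumed loss form)
    and its analytic gradient with respect to the prototype positions."""
    diff = anchors[:, None, :] - prototypes[None, :, :]   # shape (A, m, d)
    kernel = np.exp(-(diff**2).sum(axis=2) / gamma**2)    # shape (A, m)
    residual = kernel.mean(axis=1) - potential(original, anchors, gamma)
    loss = np.mean(residual**2)
    n_anchors, n_prototypes = kernel.shape
    scale = 4.0 / (n_anchors * n_prototypes * gamma**2)
    grad = scale * np.einsum("a,am,amd->md", residual, kernel, diff)
    return loss, grad


def resample(original, n_prototypes, gamma=0.5, lr=0.1, n_iter=200, seed=0):
    """Place `n_prototypes` synthetic points whose potential resembles that
    of `original`. Fewer prototypes than observations corresponds to
    undersampling, more to oversampling."""
    rng = np.random.default_rng(seed)
    anchors = original  # assumption: compare potentials at the observed points
    # Start from jittered copies of randomly chosen original observations.
    idx = rng.choice(len(original), size=n_prototypes, replace=True)
    jitter = rng.normal(scale=0.05, size=(n_prototypes, original.shape[1]))
    prototypes = original[idx] + jitter
    for _ in range(n_iter):
        _, grad = resemblance_loss_and_grad(prototypes, original, anchors, gamma)
        prototypes -= lr * grad  # fixed-step gradient descent
    return prototypes
```

In this sketch, calling `resample(X_majority, n_prototypes=k)` with `k` smaller than the class size would stand in for undersampling, while a larger `k` on the minority class would stand in for oversampling. How the published algorithm selects anchors, sets the kernel width, and combines the two steps is not stated in the abstract.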

【 License 】

Free   
