Journal Article Details
Open Physics
Semi-supervised Classification Based Mixed Sampling for Imbalanced Data
Liu Ning1  Zhao Jianhua2 
[1] College of Economics Management, Shangluo University, Shangluo 726000, China; College of Mathematics and Computer Application, Shangluo University, Shangluo 726000, China
Keywords: semi-supervised learning; imbalanced data; over sampling; under sampling; ensemble learning; 89.20.ff; 89.75.kd; 89.70.cf
DOI: 10.1515/phys-2019-0103
Source: DOAJ
【 Abstract 】

In practical applications, imbalanced data sets often contain only a small number of labeled samples. To improve classification performance on such problems, this paper proposes a semi-supervised learning algorithm based on mixed sampling for imbalanced data classification (S2MAID), which combines semi-supervised learning, over sampling, under sampling and ensemble learning. First, an under sampling algorithm, UD-density, selects samples with high information content from the majority class set for semi-supervised learning. Second, a safe supervised-learning method is used to label the unlabeled samples and expand the labeled set. Third, an over sampling algorithm, SMOTE-density, balances the imbalanced data set. Fourth, an ensemble technique is used to generate a strong classifier. Finally, experiments are carried out on imbalanced data containing only a few labeled samples, simulating the semi-supervised learning process. The experimental results show that the proposed S2MAID achieves better classification performance.
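
The mixed-sampling idea in the abstract (shrink the majority class, grow the minority class) can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not specify UD-density or SMOTE-density, so plain random under sampling and standard SMOTE-style interpolation are swapped in here, and the semi-supervised labeling and ensemble steps are omitted. All function names and parameters (`random_undersample`, `smote_like_oversample`, `mixed_sample`, `target`, `k`) are hypothetical and chosen only for illustration.

```python
import math
import random


def random_undersample(majority, n_keep, seed=0):
    """Keep n_keep majority samples chosen uniformly at random.
    (Assumption: stands in for the paper's density-based selection.)"""
    rng = random.Random(seed)
    return rng.sample(majority, n_keep)


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (plain SMOTE idea)."""
    if n_new > 0 and len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def mixed_sample(majority, minority, target, k=3, seed=0):
    """Bring both classes to roughly `target` samples each."""
    kept = random_undersample(majority, min(target, len(majority)), seed)
    n_new = max(0, target - len(minority))
    grown = list(minority) + smote_like_oversample(minority, n_new, k, seed)
    return kept, grown


# Usage: 20 majority points and 4 minority points, balanced to 10 each.
majority = [(float(i), float(i % 5)) for i in range(20)]
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
kept, grown = mixed_sample(majority, minority, target=10)
```

Because each synthetic point lies on a segment between two existing minority samples, the over-sampled set stays inside the region the minority class already occupies; a density-aware variant would presumably change which samples and neighbours are chosen, but that detail is not given in the abstract.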

【 License 】

Unknown   
