Journal article details
Journal of the Brazilian Computer Society
Combining instance selection and self-training to improve data stream quantification
article
André G. Maletzke [1], Denis M. dos Reis [1], Gustavo E. A. P. A. Batista [1]
[1] Laboratório de Inteligência Computacional (LABIC), Instituto de Ciências Matemáticas e de Computação (ICMC), Universidade de São Paulo
Keywords: Data stream; Quantification; Concept drift
DOI: 10.1186/s13173-018-0076-0
Source: Springer UK
【 Abstract 】

In recent years, learning from data streams has attracted the attention of researchers and practitioners due to its large number of applications. These applications have motivated the research community to propose a significant number of methods for diverse tasks, most prominently classification, clustering, and anomaly detection. However, a relevant task known as quantification has remained mostly unexplored. The goal of quantification is to estimate the class prevalence in an unlabeled set. Recently, we proposed the SQSI algorithm to quantify data streams with concept drifts. SQSI uses a statistical test to identify concept drifts and retrain the classifiers. However, retraining requires the labels of all newly arrived instances. In this paper, we extend the SQSI algorithm by exploring instance selection techniques combined with semi-supervised learning. The idea is to request the classes of a smaller subset of recent examples. Our experiments demonstrate that the SQSI extension significantly reduces the dependency on actual labels while maintaining or improving the quantification accuracy.
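
To make the quantification task concrete, the following minimal Python sketch estimates the prevalence of the positive class in an unlabeled window. It is not the SQSI algorithm from the paper: it shows the generic classify-and-count baseline and the well-known adjusted-count correction, both chosen here only for illustration. The threshold classifier, the window of scores, and the tpr/fpr values are hypothetical.

```python
from typing import Callable, Sequence


def classify_and_count(
    classify: Callable[[float], int], unlabeled: Sequence[float]
) -> float:
    """Estimate positive-class prevalence as the fraction predicted positive."""
    if not unlabeled:
        raise ValueError("unlabeled set must not be empty")
    predictions = [classify(x) for x in unlabeled]
    return sum(predictions) / len(predictions)


def adjusted_count(raw: float, tpr: float, fpr: float) -> float:
    """Correct a raw estimate using the classifier's true/false positive rates."""
    if tpr == fpr:
        return raw  # correction is undefined; fall back to the raw estimate
    corrected = (raw - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, corrected))  # clip to a valid proportion


# Hypothetical classifier: scores above 0.5 are labelled positive (1).
def threshold_classifier(score: float) -> int:
    return int(score > 0.5)


# Hypothetical unlabeled window of one-dimensional scores.
window = [0.1, 0.9, 0.7, 0.2, 0.8, 0.3, 0.6, 0.4]
raw_estimate = classify_and_count(threshold_classifier, window)

# tpr and fpr are assumed to come from held-out labeled data.
estimate = adjusted_count(raw_estimate, tpr=0.9, fpr=0.1)
```

The point of the sketch is that a quantifier outputs one proportion for the whole set rather than a label per instance, so its error is measured on the estimated prevalence and not on individual predictions.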

【 License 】

Unknown   
