Journal article

【Abstract】

In recent years, learning from data streams has attracted the attention of researchers and practitioners due to its large number of applications. These applications have motivated the research community to propose a significant number of methods for diverse tasks, most prominently classification, clustering, and anomaly detection. However, a relevant task known as quantification has remained mostly unexplored. The goal of quantification is to estimate the class prevalence in an unlabeled set. Recently, we proposed the SQSI algorithm to quantify data streams with concept drifts. SQSI uses a statistical test to identify concept drifts and retrain the classifiers. However, retraining requires the labels of all newly arrived instances. In this paper, we extend the SQSI algorithm by exploring instance selection techniques allied to semi-supervised learning. The idea is to request the classes of a smaller subset of recent examples. Our experiments demonstrate that SQSI’s extension significantly reduces the dependency on actual labels while maintaining or improving the quantification accuracy.

【License】

Unknown

Brazilian Computer Society. Journal
Combining instance selection and self-training to improve data stream quantification
article
André G. Maletzke¹, Denis M. dos Reis¹, Gustavo E. A. P. A. Batista¹
[1] Laboratório de Inteligência Computacional (LABIC), Instituto de Ciências Matemáticas e de Computação (ICMC), Universidade de São Paulo
Keywords: Data stream; Quantification; Concept drift
DOI : 10.1186/s13173-018-0076-0
Source: Springer UK