Journal article

【Abstract】

In common binary classification scenarios, both positive and negative examples are needed in the training data to build an efficient classifier. Unfortunately, in many domains this requirement is not satisfied and examples of only one class are available. To cope with this setting, classification algorithms have been introduced that learn from Positive and Unlabeled (PU) data. Originally, these approaches were applied to document classification. Only a few works address the PU problem for categorical datasets, and the available algorithms are mainly based on Naive Bayes classifiers. In this work we present Pulce, a new distance-based PU learning approach for categorical data. Our framework takes advantage of the intrinsic relationships between attribute values and goes beyond the independence assumption made by Naive Bayes. Pulce leverages the statistical properties of the data to learn a distance metric that is then employed during the classification task. We extensively validate our approach on real-world datasets and demonstrate that our strategy obtains statistically significant improvements with respect to state-of-the-art competitors. © 2016 Elsevier B.V. All rights reserved.
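
To make the setting concrete, here is a minimal sketch of a generic two-step, distance-based PU classifier for categorical rows. It is not the Pulce algorithm: the abstract does not spell out how Pulce learns its metric, so this sketch assumes a plain Hamming (mismatch) distance instead, and the function names (`hamming`, `reliable_negatives`, `classify`), the `fraction` and `k` parameters, and the toy data are all hypothetical.

```python
from typing import List, Sequence

Row = Sequence[str]


def hamming(a: Row, b: Row) -> float:
    """Fraction of attributes on which two categorical rows differ.

    Assumption: a fixed mismatch distance stands in for a learned metric.
    """
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_distance(row: Row, rows: List[Row]) -> float:
    """Average distance from one row to a set of rows."""
    return sum(hamming(row, r) for r in rows) / len(rows)


def reliable_negatives(positives: List[Row], unlabeled: List[Row],
                       fraction: float = 0.5) -> List[Row]:
    """Step 1: treat the unlabeled rows farthest from the positives
    as likely negatives (a common PU heuristic, assumed here)."""
    ranked = sorted(unlabeled,
                    key=lambda u: mean_distance(u, positives),
                    reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


def classify(row: Row, positives: List[Row], negatives: List[Row],
             k: int = 3) -> int:
    """Step 2: k-nearest-neighbour vote between positives (label 1)
    and the reliable negatives (label 0)."""
    neighbours = [(hamming(row, p), 1) for p in positives]
    neighbours += [(hamming(row, n), 0) for n in negatives]
    neighbours.sort(key=lambda pair: pair[0])
    top = neighbours[:k]
    votes = sum(label for _, label in top)
    return 1 if votes * 2 > len(top) else 0


# Toy data, invented purely for illustration.
positives = [("red", "round", "small"),
             ("red", "round", "large"),
             ("red", "oval", "small")]
unlabeled = [("red", "round", "small"),
             ("green", "square", "large"),
             ("blue", "square", "large"),
             ("red", "oval", "large")]

negatives = reliable_negatives(positives, unlabeled, fraction=0.5)
print(negatives)
print(classify(("red", "round", "large"), positives, negatives))
print(classify(("blue", "square", "large"), positives, negatives))
```

A metric learned from the data, as the abstract describes, would replace `hamming`. The two-step structure around it is only one possible design and is assumed here for illustration.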

【License】

Free

【Preview】

Attachments
Files	Size	Format
10_1016_j_neucom_2016_01_089.pdf	1238KB	PDF

NEUROCOMPUTING	Volume: 196
Positive and unlabeled learning in categorical data
Article
Ienco, Dino [1,3]; Pensa, Ruggero G. [2]
[1] IRSTEA Montpellier, UMR TETIS, F-34093 Montpellier, France
[2] Univ Turin, Dept Comp Sci, I-10149 Turin, Italy
[3] LIRMM Montpellier, ADVANSE, F-34090 Montpellier, France
Keywords: Positive unlabeled learning; Partially supervised learning; Distance learning; Categorical data
DOI: 10.1016/j.neucom.2016.01.089
Source: Elsevier
