Journal article details
PATTERN RECOGNITION, Volume 40
A semi-supervised regression model for mixed numerical and categorical variables
Article
Ng, Michael K. ; Chan, Elaine Y. ; So, Meko M. C. ; Ching, Wai-Ki
Keywords: clustering; regression; data mining; numerical variables; categorical variables
DOI: 10.1016/j.patcog.2006.06.018
Source: Elsevier
【Abstract】

In this paper, we develop a semi-supervised regression algorithm to analyze data sets that contain both categorical and numerical attributes. The algorithm partitions a data set into several clusters and, at the same time, fits a multivariate regression model to each cluster. This framework allows one to incorporate both multivariate regression models for numerical variables (supervised learning methods) and k-mode clustering algorithms for categorical variables (unsupervised learning methods). The estimates of the regression models and the k-mode parameters can be obtained simultaneously by minimizing a function which is the weighted sum of the least-squares errors in the multivariate regression models and the dissimilarity measures among the categorical variables. Both synthetic and real data sets are presented to demonstrate the effectiveness of the proposed method. © 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
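
The abstract describes the objective only in words, so the following is a minimal sketch of one way such a weighted objective could be minimized, not the authors' algorithm. It assumes a scalar response `y`, numerical predictors `X`, integer-coded categorical attributes `C`, simple matching (Hamming) dissimilarity to a per-cluster mode, and a hypothetical weight `gamma`; the function name and the alternating update scheme are likewise illustrative assumptions.

```python
import numpy as np

def fit_mixed_clusterwise_regression(X, y, C, k, gamma=1.0, n_iter=50, seed=0):
    """Alternating minimization of (squared residual + gamma * mismatches).

    X: (n, p) numerical predictors, y: (n,) numerical response,
    C: (n, q) categorical attributes, k: number of clusters.
    All names and the update scheme are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.hstack([np.ones((n, 1)), X])          # prepend an intercept column
    labels = rng.integers(0, k, size=n)           # random initial partition
    betas = np.zeros((k, Xb.shape[1]))            # one coefficient vector per cluster
    modes = C[rng.choice(n, size=k, replace=False)].copy()  # initial cluster modes

    for _ in range(n_iter):
        # Update step: refit each cluster's regression and categorical mode.
        for j in range(k):
            mask = labels == j
            if not mask.any():
                continue                          # keep old parameters for empty clusters
            betas[j] = np.linalg.lstsq(Xb[mask], y[mask], rcond=None)[0]
            for a in range(C.shape[1]):
                vals, counts = np.unique(C[mask, a], return_counts=True)
                modes[j, a] = vals[np.argmax(counts)]

        # Assignment step: cost of placing every point in every cluster.
        resid2 = (y[:, None] - Xb @ betas.T) ** 2                    # shape (n, k)
        mismatch = (C[:, None, :] != modes[None, :, :]).sum(axis=2)  # shape (n, k)
        cost = resid2 + gamma * mismatch
        new_labels = cost.argmin(axis=1)
        converged = np.array_equal(new_labels, labels)
        labels = new_labels
        if converged:
            break

    # Objective value for the returned labels and parameters.
    resid2 = (y - np.einsum("ij,ij->i", Xb, betas[labels])) ** 2
    mismatch = (C != modes[labels]).sum(axis=1)
    objective = float((resid2 + gamma * mismatch).sum())
    return labels, betas, modes, objective
```

In this sketch a larger `gamma` lets the categorical attributes dominate the partition, while a smaller one lets the regression fit dominate. Like other alternating schemes, it can stop at a local minimum, so running it from several seeds and keeping the lowest objective is a reasonable precaution.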

【License】

Free
