Thesis Details
Regression via Clustering using Dirichlet Mixtures
Keywords: Bayesian; clustering; Dirichlet mixtures
Kang, Changku ; Hao H. Zhang, Committee Member ; Subhashis Ghosal, Committee Chair ; John F. Monahan, Committee Member ; Sujit K. Ghosh, Committee Member
University: North Carolina State University
Others: https://repository.lib.ncsu.edu/bitstream/handle/1840.16/3822/etd.pdf?sequence=1&isAllowed=y
United States | English
【 Abstract 】

Regression analysis is a fundamental problem of statistics. When the regression function has an unknown form, parametric analysis is sometimes inappropriate, and the regression function should instead be estimated by nonparametric methods. Often, the regressor variable is sampled from several different subpopulations, and the regression function takes a different form depending on the source subpopulation. The labels of these source subpopulations are not observable. Although a nonparametrically specified regression function can capture the overall regression function, nonparametric regression estimates usually depend on the assumption of homoscedasticity of additive errors. If the underlying distribution of X has unknown clusters, the usual homoscedasticity assumption does not hold. To estimate the regression function, we propose first finding clusters in the regressor variables using a Dirichlet mixture, in order to impute the lost subpopulation labels. A standard regression method such as linear or polynomial regression may then be used within each cluster. A Markov chain Monte Carlo (MCMC) sampling method is used to find the clusters, and the estimated regression functions can be obtained for each sample. We also apply our method to the large p, small n problem, where the number of variables p is much greater than the number of samples n. In several simulation experiments, our method is compared with other methods, such as kernel methods and smoothing splines in the univariate case, and GAM (generalized additive model) and MARS (Multivariate Adaptive Regression Splines) in the multivariate case. The consistency issue is discussed without explicit proof.
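The two-stage idea in the abstract (cluster the regressor, then fit a standard regression within each cluster) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes a one-dimensional regressor, a Dirichlet process mixture of normals with known within-cluster variance `sigma2`, a normal prior N(`mu0`, `tau2`) on cluster means, and a collapsed Gibbs sampler. All function names, hyperparameter values, and the simulated data are hypothetical. For brevity it fits lines using only the labels from the final MCMC sweep, whereas the abstract describes obtaining regression estimates for each sample.

```python
import math
import random


def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def dp_cluster(xs, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=25.0, sweeps=50, seed=0):
    """Collapsed Gibbs sampler for a Dirichlet process mixture of 1-D normals.

    Assumes known within-cluster variance sigma2 and a N(mu0, tau2) prior on
    cluster means. Returns the cluster labels from the final sweep only.
    """
    rng = random.Random(seed)
    labels = [0] * len(xs)              # start with every point in one cluster
    count, total = {0: len(xs)}, {0: sum(xs)}
    next_id = 1
    for _ in range(sweeps):
        for i, x in enumerate(xs):
            # Remove point i from its current cluster.
            c = labels[i]
            count[c] -= 1
            total[c] -= x
            if count[c] == 0:
                del count[c], total[c]
            # Weight of each existing cluster: size times posterior predictive.
            ids = list(count)
            weights = []
            for k in ids:
                post_var = 1.0 / (1.0 / tau2 + count[k] / sigma2)
                post_mean = post_var * (mu0 / tau2 + total[k] / sigma2)
                weights.append(count[k] * normal_pdf(x, post_mean, post_var + sigma2))
            # Weight of opening a new cluster: alpha times prior predictive.
            weights.append(alpha * normal_pdf(x, mu0, tau2 + sigma2))
            pick = rng.choices(range(len(weights)), weights=weights)[0]
            if pick == len(ids):
                c = next_id
                next_id += 1
                count[c], total[c] = 0, 0.0
            else:
                c = ids[pick]
            labels[i] = c
            count[c] += 1
            total[c] += x
    return labels


def fit_lines(xs, ys, labels):
    """Least-squares line per cluster: label -> (intercept, slope, size)."""
    fits = {}
    for c in set(labels):
        px = [x for x, l in zip(xs, labels) if l == c]
        py = [y for y, l in zip(ys, labels) if l == c]
        mx, my = sum(px) / len(px), sum(py) / len(py)
        sxx = sum((x - mx) ** 2 for x in px)
        sxy = sum((x - mx) * (y - my) for x, y in zip(px, py))
        slope = sxy / sxx if sxx > 0 else 0.0   # constant fit for degenerate clusters
        fits[c] = (my - slope * mx, slope, len(px))
    return fits


# Simulated data (hypothetical): two hidden subpopulations with different lines.
data_rng = random.Random(1)
xs = [data_rng.gauss(-5, 0.5) for _ in range(20)] + [data_rng.gauss(5, 0.5) for _ in range(20)]
ys = [1 + 2 * x + data_rng.gauss(0, 0.05) for x in xs[:20]] + \
     [10 - x + data_rng.gauss(0, 0.05) for x in xs[20:]]

labels = dp_cluster(xs)
fits = fit_lines(xs, ys, labels)
for c, (intercept, slope, size) in sorted(fits.items()):
    print(f"cluster {c}: n={size}, y = {intercept:.2f} + {slope:.2f} x")
```

The number of clusters is not fixed in advance: the concentration parameter `alpha` controls how readily a point opens a new cluster, so the sampler may occasionally produce small extra clusters alongside the two main ones.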

【 Preview 】
Attachments
Files Size Format View
Regression via Clustering using Dirichlet Mixtures 524KB PDF download