Dissertation Details
Boosting methods for variable selection in high dimensional sparse models
Keywords: regression; high dimensional sparse models; variable selection; binary classification; boosting; lasso; elastic net; support vector machines; gene expression data
Author: Hwang, Wook Yeon
Committee Chair: Subhashis Ghosal
Committee Members: Hao Helen Zhang; Howard Bondell; Wenbin Lu
University: North Carolina State University
Full text: https://repository.lib.ncsu.edu/bitstream/handle/1840.16/4092/etd.pdf?sequence=1&isAllowed=y
United States | English
PDF
【 Abstract 】

Firstly, we propose new variable selection techniques for regression in high dimensional linear models based on a forward selection version of the LASSO, adaptive LASSO, or elastic net, respectively called the forward iterative regression and shrinkage technique (FIRST), adaptive FIRST, and elastic FIRST. These methods seem to work better for an extremely sparse high dimensional linear regression model. We exploit the fact that the LASSO, adaptive LASSO, and elastic net have closed-form solutions when the predictor is one-dimensional. The explicit formula is then applied repeatedly in an iterative fashion until convergence occurs. By carefully considering the relationship between estimators at successive stages, we develop fast algorithms to compute our estimators. The performance of our new estimators is compared with that of commonly used estimators in terms of predictive accuracy and errors in variable selection. We observe that our approach has better prediction performance for highly sparse high dimensional linear regression models.

Secondly, we propose a new variable selection technique for binary classification in high dimensional models based on a forward selection version of the squared Support Vector Machines or one-norm Support Vector Machines, called the forward iterative selection and classification algorithm (FISCAL). This method seems to work better for a highly sparse high dimensional binary classification model. We suggest squared support vector machines using the 1-norm and 2-norm simultaneously. The squared support vector machines are convex and differentiable except at zero when the predictor is one-dimensional. An iterative forward selection approach is then applied along with the squared support vector machines until a stopping rule is satisfied. We also develop a recursive algorithm for the FISCAL to reduce the computational burden. We apply the same process to the original one-norm Support Vector Machines.
We compare the FISCAL with other widely used binary classification approaches with regard to prediction performance and selection accuracy. The FISCAL shows competitive prediction performance for highly sparse high dimensional binary classification models.
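The one-dimensional closed form exploited by FIRST is the soft-thresholding operator, which is the standard LASSO solution for a single standardized predictor. Below is a minimal sketch that applies the univariate formula cyclically until convergence, in the spirit of coordinate-wise shrinkage; the function names and the forward-selection bookkeeping of FIRST itself are illustrative assumptions, not the dissertation's exact algorithm.

```python
import numpy as np

def soft_threshold(z, lam):
    """Closed-form one-dimensional LASSO solution (soft-thresholding)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def iterative_shrinkage(X, y, lam, n_iter=100, tol=1e-8):
    """Cyclically apply the univariate closed form until convergence.

    Assumes columns of X are standardized (mean 0, mean square 1).
    Illustrative coordinate-wise analogue, not the exact FIRST updates.
    """
    n, p = X.shape
    beta = np.zeros(p)
    r = y.astype(float).copy()  # current residual y - X @ beta
    for _ in range(n_iter):
        max_delta = 0.0
        for j in range(p):
            old = beta[j]
            # univariate problem on the partial residual
            z = old + X[:, j] @ r / n
            beta[j] = soft_threshold(z, lam)
            if beta[j] != old:
                r -= X[:, j] * (beta[j] - old)
                max_delta = max(max_delta, abs(beta[j] - old))
        if max_delta < tol:  # no coordinate moved: converged
            break
    return beta
```

On data generated from a sparse linear model, the estimate concentrates on the true nonzero coefficient, and a sufficiently large penalty drives all coefficients to zero.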
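The squared support vector machine described above combines a squared hinge loss with 1-norm and 2-norm penalties. A minimal proximal-gradient sketch of that penalized objective follows; the solver name, the fixed step size, and the batch proximal-gradient scheme are assumptions for illustration, and FISCAL's forward selection and stopping rule are not reproduced.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def squared_svm_elastic(X, y, lam1, lam2, step=1e-3, n_iter=2000):
    """Squared-hinge SVM with both a 1-norm and a 2-norm penalty.

    Minimizes sum_i max(0, 1 - y_i x_i' beta)^2 + lam2 ||beta||_2^2
    + lam1 ||beta||_1 by proximal gradient descent (ISTA-style):
    gradient step on the smooth part, prox step on the 1-norm.
    Labels y must be in {-1, +1}. Illustrative sketch only.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        margin_slack = np.maximum(0.0, 1.0 - y * (X @ beta))
        # gradient of squared hinge + ridge term
        grad = -2.0 * X.T @ (y * margin_slack) + 2.0 * lam2 * beta
        beta = prox_l1(beta - step * grad, step * lam1)
    return beta
```

On a nearly separable toy problem the fitted direction recovers the signs of the labels, while a very large 1-norm penalty keeps every coefficient at zero, matching the nondifferentiability of the objective at zero.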

【 Preview 】
Attachments
File | Size | Format
Boosting methods for variable selection in high dimensional sparse models | 454KB | PDF
Metrics
Downloads: 17 | Views: 18