Journal Article Details
Statistical Analysis and Data Mining
Model selection procedure for high‐dimensional data
Yongli Zhang [1], Xiaotong Shen [2]
[1] Lundquist College of Business, University of Oregon, 1208 University Ave, Eugene, OR 97403, USA; [2] School of Statistics, University of Minnesota, 224 Church Street S.E., Minneapolis, MN 55455, USA
Keywords: model selection; information criterion; large p but small n; RIC; power market
DOI: 10.1002/sam.10088
Subject classification: Social sciences, humanities and arts (general)
Source: John Wiley & Sons, Inc.
【Abstract】

For high‐dimensional regression, the number of predictors may greatly exceed the sample size, but only a small fraction of them are related to the response. Variable selection is therefore unavoidable, and consistent model selection is the primary concern. However, conventional consistent model selection criteria such as the Bayesian information criterion (BIC) may be inadequate, both because they do not adapt to the model space and because exhaustive search is infeasible. To address these two issues, we establish a probability lower bound for selecting the smallest true model by an information criterion, and based on this bound we propose a model selection criterion, which we call RICc, that adapts to the model space. Furthermore, we develop a computationally feasible method combining the computational power of least angle regression (LAR) with that of RICc. Both theoretical and simulation studies show that this method identifies the smallest true model with probability converging to one if the smallest true model is selected by LAR. The proposed method is applied to real data from the power market and outperforms backward variable selection in terms of price forecasting accuracy.
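The idea of scoring models along a path with a criterion that adapts to the model space can be illustrated with a small sketch. The abstract does not state the RICc formula, so the code below substitutes the classical RIC penalty of 2 log p per predictor as an assumed stand-in; it is not the authors' criterion. The residual sums of squares in `rss_path` are made-up numbers standing in for the nested models that a LAR path would produce, and the function names are hypothetical.

```python
import math


def bic_score(rss, n, k):
    """Classical BIC for a Gaussian linear model with k predictors."""
    return n * math.log(rss / n) + k * math.log(n)


def ric_score(rss, n, k, p):
    """Assumed RIC-style score: penalty 2*log(p) per predictor.

    This is the classical RIC penalty, used here only as a stand-in;
    the RICc of the paper is not given in the abstract.
    """
    return n * math.log(rss / n) + 2.0 * k * math.log(p)


def select_along_path(rss_path, score):
    """Return the model size k that minimises score(rss, k).

    rss_path[k] is the residual sum of squares of the k-predictor model
    on a nested path (k = 0, 1, 2, ...), e.g. one produced by LAR.
    """
    scores = [score(rss, k) for k, rss in enumerate(rss_path)]
    return min(range(len(scores)), key=scores.__getitem__)


# Hypothetical setting: n = 50 observations, p = 1000 candidate predictors.
n, p = 50, 1000
# Made-up RSS values: two large drops, then small drops of the kind that
# arise when the best of many noise predictors is added at each step.
rss_path = [100.0, 60.0, 40.0, 36.0, 33.0, 30.0]

k_bic = select_along_path(rss_path, lambda rss, k: bic_score(rss, n, k))
k_ric = select_along_path(rss_path, lambda rss, k: ric_score(rss, n, k, p))

print("size chosen by BIC:", k_bic)
print("size chosen by RIC-style score:", k_ric)
```

With these illustrative numbers, the BIC penalty (log n per predictor) ignores how many candidates were searched and keeps adding predictors, while the penalty that grows with p stops after the two large drops. Scoring only the models on the path also avoids an exhaustive search over all subsets.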

【License】

Unknown   
