Dissertation

【Abstract】

Innovation in modern technologies drives research and development on high-dimensional data analysis in diverse fields, where variable selection plays a pivotal role in ensuring credible model estimation. We focus on scalable algorithms for variable selection that can handle large data sets. First, we propose an EM algorithm that returns the MAP estimate of the set of relevant variables. Owing to its particular updating scheme, our algorithm can be implemented efficiently. We also show that the MAP estimate returned by our EM algorithm achieves variable selection consistency. In practice, the EM algorithm tends to get stuck at local peaks, so we propose an ensemble version: repeatedly apply the EM algorithm to a subset of bootstrap-sampled data and then aggregate the results. Empirical studies demonstrate the superior performance of this Bayesian Bootstrap EM algorithm. Second, we propose a hybrid computation framework for Bayesian variable selection. This new algorithm, SAB, combines the classical EM algorithm with the variational Bayes algorithm. It is very fast in handling high-dimensional data with a large number of covariates. To address a critical biological problem, we apply SAB to a state-of-the-art cancer genomics data set with the goal of understanding the complex regulatory relationship between miRNAs and mRNAs in cancer. In the third part, we study the asymptotic behavior of the SAB algorithm in detail and prove that SAB achieves selection consistency, Bayesian consistency, and an oracle property when the number of covariates grows exponentially with the sample size. Lastly, we extend the hybrid framework of Bayesian variable selection to logistic models, where we adopt the Polya-Gamma specification and show that this specification is equivalent to the local approximation method in the variational Bayes framework.

【Preview】

Attachments
Files	Size	Format	View
Scalable algorithms for Bayesian variable selection	2004KB	PDF	download


Scalable algorithms for Bayesian variable selection
Wang, Jin
Keywords: Variable Selection; EM; Ensemble; Variational Bayes; Asymptotic Analysis; Logistic model
Others : https://www.ideals.illinois.edu/bitstream/handle/2142/92827/WANG-DISSERTATION-2016.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship


	Document metrics
	Downloads: 27	Views: 41
