Dissertation Details
Some Contributions to High Dimensional Mixed Effects Logistic Regression Models
Keywords: high dimensional statistics; mixed effects generalized linear model; Iterated Filtering algorithms; proximal gradient algorithms; non-convex convergence analysis; stochastic restricted eigenvalues condition; Statistics and Numeric Data; Science; Statistics
Guo, Jun; Zhu, Ji
University of Michigan
Others  :  https://deepblue.lib.umich.edu/bitstream/handle/2027.42/145938/guojun_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

High dimensional mixed-effects generalized linear models extend generalized linear models (GLMs) by adding random effects to the linear predictors of the original high dimensional GLMs. High dimensional mixed-effects logistic regression is a typical example. These models are useful for analyzing categorical or discrete data with a group structure. Inference for these models is challenging because the negative log-likelihood function is intractable and generally non-convex. In this dissertation, we propose and analyze four different algorithms to solve the high dimensional mixed-effects logistic regression model.

The first two algorithms we develop are the stochastic proximal gradient and second-order approximate algorithms, both of which are proximal gradient-based. Because the gradient of the loss function is intractable, the stochastic proximal gradient algorithm uses a Markov chain Monte Carlo technique to approximate the gradient, while the second-order approximate algorithm approximates the objective function by a second-order Taylor expansion and solves the resulting approximate problem. We prove the convergence of the second-order approximate algorithm using techniques based on the Kurdyka-Łojasiewicz (K-L) property. To analyze the convergence behavior of the stochastic proximal gradient algorithm, we extend this K-L based technique to incorporate stochastic perturbations in the algorithm updates. We show that the stochastic algorithm's limiting points are stationary points of the original objective function. We illustrate the good performance of our algorithms in several numerical examples. We also apply the two algorithms in a breast cancer data analysis.

The next algorithm we consider is based on a "fixed effect approximation" of the mixed effects models. Here we treat the random effects as unknown fixed effects coefficients and estimate them without penalty.
The approximation reduces the original problem to the usual high dimensional logistic regression with offset terms. Computational efficiency is a clear gain, and the non-convex problem is replaced by a convex one. We derive a non-asymptotic estimation error bound for its solution with respect to the true model parameters. In this effort, we extend the restricted eigenvalue (RE) condition to a stochastic setting, which holds with high probability in our problem. We conduct an extensive numerical study of this approximation scheme and compare its performance with the previous two algorithms. The same breast cancer data is analyzed with this algorithm.

Our final algorithms are the iterated filtering algorithms. Their core is a novel "pseudo proximal map", which computes the mean of a constructed log-likelihood function to approximate the optimum of the objective function. We explore its connections to the proximal and gradient descent algorithms and focus on its application in composite objective function optimization. We then devise the iterated filtering algorithm and its block coordinate update version to solve the high dimensional mixed-effects logistic regression model. Under a strong convexity assumption, we derive new convergence results for the algorithm that are sharper than previous results in the literature. We use numerical studies to demonstrate the effectiveness of our algorithm.
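
The fixed effect approximation described in the abstract can be illustrated with a small sketch. It is not the dissertation's implementation: it assumes a random-intercept model (one intercept per group), an L1 penalty on the regression coefficients only, and a plain deterministic proximal gradient solver swapped in for whatever solver the author used. The function names, the penalty level `lam`, and the conservative step-size rule are all hypothetical choices made for illustration.

```python
import math


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(v, tau):
    # Proximal map of tau * |v|: shrink v toward zero by tau.
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def objective(X, y, groups, beta, alpha, lam):
    # Average logistic negative log-likelihood plus an L1 penalty on beta.
    # The group intercepts alpha are deliberately left unpenalized.
    total = 0.0
    for xi, yi, gi in zip(X, y, groups):
        eta = alpha[gi] + sum(b * x for b, x in zip(beta, xi))
        # log(1 + exp(eta)) - y * eta, written to avoid overflow.
        total += max(eta, 0.0) + math.log1p(math.exp(-abs(eta))) - yi * eta
    return total / len(y) + lam * sum(abs(b) for b in beta)


def fit_fixed_effect_approx(X, y, groups, n_groups, lam=0.05, n_iter=300):
    # Assumed setup: each group's random intercept is treated as an
    # unknown fixed coefficient, which makes the problem convex.
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    alpha = [0.0] * n_groups
    # Conservative step size 1/L, using the bound
    # L <= (1 / (4n)) * sum_i (1 + ||x_i||^2) for the logistic loss.
    step = 4.0 * n / sum(1.0 + sum(x * x for x in xi) for xi in X)
    for _ in range(n_iter):
        grad_beta = [0.0] * p
        grad_alpha = [0.0] * n_groups
        for xi, yi, gi in zip(X, y, groups):
            eta = alpha[gi] + sum(b * x for b, x in zip(beta, xi))
            r = (sigmoid(eta) - yi) / n
            grad_alpha[gi] += r
            for j in range(p):
                grad_beta[j] += r * xi[j]
        # Plain gradient step for the unpenalized intercepts.
        alpha = [a - step * g for a, g in zip(alpha, grad_alpha)]
        # Gradient step followed by soft-thresholding for beta.
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad_beta)]
    return beta, alpha
```

Calling `fit_fixed_effect_approx(X, y, groups, n_groups)` with `X` as a list of feature rows, `y` as 0/1 labels, and `groups` as integer group indices returns the penalized coefficients and one intercept per group. Because the step size never exceeds the inverse Lipschitz constant of the smooth part, the objective does not increase from one iteration to the next.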
