In this dissertation we develop theory for inference and uncertainty quantification for potentially misspecified quantile regression processes when the number of predictor variables increases with or exceeds the sample size. Potential misspecification of the fitted model is a fundamental problem in statistics which is exacerbated by today's high-dimensional datasets, and quantile regression is often used in complex situations in which misspecification is highly likely. We make the following contributions. First, we establish a uniform-in-model strong Bahadur representation for misspecified quantile regression processes when the number of predictor variables increases, and we provide tight error bounds on its remainder term which hold uniformly over growing collections of quantile regression functions. Second, we derive an almost sure de-biased representation of the Lasso-penalized high-dimensional misspecified quantile regression process and analyze the theoretical properties of the misspecified post-Lasso quantile regression estimator. Third, to quantify the uncertainty associated with a misspecified quantile regression function, we analyze its predictive risk and expected optimism. We propose uniformly consistent estimators for both quantities when the number of regression functions grows moderately with the sample size. Empirical evidence shows that our estimators compare favorably with cross-validation estimates. Fourth, we develop a set of new exponential and maximal inequalities which allow us to control the fluctuations of a collection of suprema of empirical processes over classes of unbounded functions when both the collection of function classes and the complexity of each individual function class grow with the sample size. These new inequalities are instrumental in deriving the theoretical results in this dissertation.
On High-Dimensional Misspecified Quantile Regression