Dissertation

【Abstract】

At the onset of the "Big Data" age, we are faced with ubiquitous data in various forms and with various characteristics, such as noise, high dimensionality, and autocorrelation. The question of how to obtain accurate and computationally efficient estimates from such data has attracted the interest of many researchers. This dissertation concentrates on two general problem areas: inference for high-dimensional and noisy data, and estimation of the steady-state mean for univariate data generated by computer simulation experiments. We develop and evaluate three separate sequential algorithms for the two topics. One major advantage of sequential algorithms is that they allow for careful experimental adjustments as sampling proceeds. Unlike one-step sampling plans, sequential algorithms adapt to different situations arising from the ongoing sampling; this makes these procedures efficacious as problems become more complicated and more delicate requirements need to be satisfied. We elaborate on each research topic below.

Concerning the first topic, our goal is to develop a robust graphical model for noisy data in a high-dimensional setting. Under a Gaussian distributional assumption, the estimation of undirected Gaussian graphs is equivalent to the estimation of inverse covariance matrices. Particular interest has focused on estimating a sparse inverse covariance matrix to gain insight into the data, as suggested by the principle of parsimony. For estimation with high-dimensional data, the influence of anomalous observations becomes severe as the dimensionality increases. To address this problem, we propose a robust estimation procedure for the Gaussian graphical model based on the Integrated Squared Error (ISE) criterion. The robustness result is obtained by using ISE as a nonparametric criterion for seeking the largest portion of the data that "matches" the model. Moreover, an ℓ₁-type regularization is applied to encourage sparse estimation. To address the non-convexity of the objective function, we develop a sequential algorithm in the spirit of a majorization-minimization scheme. We summarize the results of Monte Carlo experiments supporting the conclusion that our estimator of the inverse covariance matrix is weakly consistent (i.e., it converges in probability to the true inverse covariance matrix) as the sample size grows large. The performance of the proposed method is compared with that of several existing approaches through numerical simulations. We further demonstrate the strength of our method with applications in genetic network inference and financial portfolio optimization.

The second topic consists of two parts, and both concern the computation of point and confidence interval (CI) estimators for the mean µ of a stationary discrete-time univariate stochastic process X ≡ {X_i : i = 1, 2, ...} generated by a simulation experiment. Point estimation is relatively easy when the underlying system starts in steady state, but the traditional way of calculating CIs usually fails because the data encountered in simulation output are typically serially correlated. We propose two distinct sequential procedures that each yield a CI for µ with user-specified reliability and absolute or relative precision. The first sequential procedure is based on variance estimators computed from standardized time series applied to nonoverlapping batches of observations, and it is characterized by its simplicity relative to methods based on batch means and its ability to deliver CIs for the variance parameter of the output process (i.e., the sum of covariances at all lags). The second procedure is the first sequential algorithm that uses overlapping variance estimators to construct asymptotically valid CI estimators for the steady-state mean based on standardized time series. Compared with other popular procedures for steady-state simulation analysis, the second procedure yields a significant reduction both in the variability of its CI estimator and in the sample size needed to satisfy the precision requirement. The effectiveness of both procedures is evaluated via comparisons with state-of-the-art methods based on batch means under a series of experimental settings: the M/M/1 waiting-time process with 90% traffic intensity; the M/H₂/1 waiting-time process with 80% traffic intensity; the M/M/1/LIFO waiting-time process with 80% traffic intensity; and an AR(1)-to-Pareto (ARTOP) process. We find that the new procedures perform comparatively well in terms of their average required sample sizes as well as the coverage and average half-length of their delivered CIs.
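To illustrate why serially correlated output breaks the traditional CI calculation mentioned in the abstract, the following minimal Python sketch contrasts a naive i.i.d.-style CI with the classical nonoverlapping batch means CI on a simulated Gaussian AR(1) process. It is not the dissertation's standardized-time-series procedure and is not sequential; it only sketches the textbook batch means baseline that the abstract names as the comparison point. The function names, the AR(1) parameters, the fixed batch count of 20, and the hard-coded t quantile are all assumptions made for illustration.

```python
import math
import random
import statistics

# 0.975 quantile of Student's t with 19 degrees of freedom (20 batches - 1).
# Hard-coded so the sketch needs only the standard library.
T_975_DF19 = 2.093


def ar1_path(n, phi, mu, seed):
    """Simulate n points of a stationary Gaussian AR(1) process with mean mu.

    Innovations have unit variance; |phi| < 1 is assumed.
    """
    rng = random.Random(seed)
    # Start from the stationary distribution so no warm-up deletion is needed.
    x = mu + rng.gauss(0.0, 1.0) / math.sqrt(1.0 - phi * phi)
    path = []
    for _ in range(n):
        x = mu + phi * (x - mu) + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def naive_ci(data, z=1.96):
    """CI that (wrongly, for correlated data) treats observations as i.i.d."""
    center = statistics.fmean(data)
    half = z * statistics.stdev(data) / math.sqrt(len(data))
    return center, half


def batch_means_ci(data, num_batches=20, t_quantile=T_975_DF19):
    """Classical nonoverlapping batch means CI for the steady-state mean.

    Observations beyond num_batches * batch_size are ignored.
    """
    m = len(data) // num_batches  # batch size
    means = [statistics.fmean(data[j * m:(j + 1) * m])
             for j in range(num_batches)]
    center = statistics.fmean(means)
    half = t_quantile * statistics.stdev(means) / math.sqrt(num_batches)
    return center, half


if __name__ == "__main__":
    path = ar1_path(n=20000, phi=0.9, mu=10.0, seed=1)
    print("naive CI (center, half-length):      ", naive_ci(path))
    print("batch means CI (center, half-length):", batch_means_ci(path))
```

With strong positive autocorrelation (here phi = 0.9), the naive half-length should come out much smaller than the batch means half-length, because the naive variance estimate ignores the covariance terms at nonzero lags. This is the undercoverage problem that motivates the batch-based and standardized-time-series procedures described above.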

【Preview】

Attachments
Files	Size	Format	View
Sequential estimation in statistics and steady-state simulation	1504KB	PDF	download


Sequential estimation in statistics and steady-state simulation
Keywords: Sequential algorithm; Standardized time series; Steady-state simulation
Tang, Peng; Alexopoulos, Christos; Goldsman, David M.; Vengazhiyil, Roshan Joseph; Shi, Jianjun; Deng, Xinwei
University: Georgia Institute of Technology
Department: Industrial and Systems Engineering
Others: https://smartech.gatech.edu/bitstream/1853/51857/1/TANG-DISSERTATION-2014.pdf
United States | English
Source: SMARTech Repository


Document metrics
Downloads: 13	Views: 7
