The class of Gaussian copula regression models provides a unified modeling framework that accommodates various marginal distributions and flexible dependence structures. In the presence of missing data, the Expectation-Maximization (EM) algorithm plays a central role in parameter estimation. This classical method is greatly challenged by multilevel correlation, a large number of model parameters, and a misaligned missing data mechanism. This dissertation develops a series of new methodologies to enhance the effectiveness of the EM algorithm in the analysis of complex correlated data, through a combination of new concepts, estimation approaches, and computing procedures.

Project 1 focuses on the development of an EM algorithm for Gaussian copula regression models with missing values, in which univariate location-scale family distributions are used for the marginal regression models and a Gaussian copula for the dependence. To improve the implementation of the EM algorithm, we establish an effective peeling procedure in the M-step that sequentially maximizes the observed log-likelihood with respect to the regression parameters and the dependence parameters. In addition, the Louis formula is provided for the calculation of the Fisher information.

Project 2 is a critical extension of Project 1, in which the assumption of a structured correlation is relaxed, so that the resulting model and algorithm can be applied to complex correlated data with missing values. The key new contribution of this extension is the development of an EM algorithm for composite likelihood estimation in the presence of misaligned missing data. We propose the complete-case composite likelihood to handle both point-identifiable and partially identifiable parameters in the Gaussian copula regression model. A partially identifiable correlation parameter is estimated by an interval. Both estimation properties and algorithmic convergence are discussed.

Motivated by electroencephalography (EEG) data, Project 3 concerns the regression analysis of multilevel correlated data. We develop a class of parametric regression models using Gaussian copulas and implement maximum likelihood estimation. The proposed model is very flexible: in terms of the regression model, it can accommodate continuous outcomes or outcomes of mixed types; in terms of dependence, it allows temporal, spatial, clustered, or combined dependence structures.
Copula Regression Models for the Analysis of Correlated Data with Missing Values.