This thesis discusses several aspects of estimation and inference for high-dimensional Gaussian graphical models and consists of two main parts. The first part considers network-based pathway enrichment analysis when network information is incomplete. Pathway enrichment analysis has become a key tool for biomedical researchers seeking insight into the biology underlying differentially expressed genes, proteins and metabolites. We propose a constrained network estimation framework that combines network estimation from cell- and condition-specific high-dimensional omics data with interaction information from existing databases. The resulting pathway topology is then used in a framework for simultaneously testing differences in the expression levels of pathway members and in their interactions. We study the asymptotic properties of the proposed network estimator and of the test for pathway enrichment, investigate their small-sample performance in simulation experiments, and illustrate the approach on two cancer data sets.

The second part of the thesis is devoted to reconstructing multiple graphical models simultaneously from high-dimensional data. We develop methodology that jointly estimates multiple Gaussian graphical models, assuming that prior information is available on how they are structurally related. The proposed method consists of two steps. In the first step, we employ neighborhood selection with a group lasso penalty to obtain estimated edge sets of the graphs. In the second step, we estimate the nonzero entries of the inverse covariance matrices by maximizing the corresponding Gaussian likelihood. We establish the consistency of the proposed method for sparse high-dimensional Gaussian graphical models and illustrate its performance using simulation experiments. An application to a climate data set is also discussed.
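
The two-step procedure in the second part can be sketched in symbols. The notation below is an assumption introduced for illustration and is not taken from the thesis: K graphs indexed by k, data matrices X^{(k)} with n_k rows and columns X_i^{(k)}, regression coefficients \theta_{ji}^{(k)}, sample covariance matrices S^{(k)}, and a tuning parameter \lambda. For simplicity each group collects one coefficient from all K graphs; the thesis forms its groups from the prior structural information, so its penalty may differ. The "or" rule used to symmetrize the edge set is likewise an assumption.

```latex
% Step 1 (assumed form): group-lasso neighborhood selection for node j
\hat{\theta}_j
  = \arg\min_{\theta_j}
    \sum_{k=1}^{K} \frac{1}{n_k}
    \Bigl\| X_j^{(k)} - \sum_{i \neq j} X_i^{(k)} \theta_{ji}^{(k)} \Bigr\|_2^2
    + \lambda \sum_{i \neq j}
      \Bigl( \sum_{k=1}^{K} \bigl(\theta_{ji}^{(k)}\bigr)^2 \Bigr)^{1/2}

% Estimated edge set of graph k (assumed "or" rule)
\hat{E}^{(k)}
  = \bigl\{ (i,j) : i \neq j,\ \hat{\theta}_{ji}^{(k)} \neq 0
      \ \text{or}\ \hat{\theta}_{ij}^{(k)} \neq 0 \bigr\}

% Step 2: Gaussian maximum likelihood restricted to the estimated edge set
\hat{\Omega}^{(k)}
  = \arg\max_{\Omega \succ 0,\ \Omega_{ij} = 0 \ \text{for}\ i \neq j,\ (i,j) \notin \hat{E}^{(k)}}
    \bigl\{ \log\det \Omega - \operatorname{tr}\bigl( S^{(k)} \Omega \bigr) \bigr\}
```

Separating support recovery (step 1) from likelihood-based estimation on the recovered support (step 2) means the nonzero entries are refit without the shrinkage that the penalty introduces.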
Estimation and Inference for High-Dimensional Gaussian Graphical Models with Structural Constraints.