Dissertation Details
Statistical Methods for High Dimensional Networked Data Analysis.
Zhou, Yan; Kretzler, Matthias
University of Michigan
Keywords: high-dimensional networked data; association maps; latent factors; structural equation models; directed acyclic graph; hybrid quadratic inference functions; Public Health; Health Sciences; Biostatistics
Others  :  https://deepblue.lib.umich.edu/bitstream/handle/2027.42/111595/zhouyan_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Networked data are frequently encountered in many scientific disciplines. A major challenge in the analysis of such data is their high dimensionality and complex dependence. My dissertation consists of three projects.

The first project focuses on the development of a sparse multivariate factor analysis regression model to construct the underlying sparse association map between gene expressions and biomarkers. This is motivated by the fact that some associations may be obscured by unknown confounding factors that are not collected in the data. I have shown that accounting for such unobserved confounding factors can increase both sensitivity and specificity for detecting important gene-biomarker associations and thus lead to more interpretable association maps.

The second project concerns the reconstruction of the underlying gene regulatory network using directed acyclic graphical models. My project aims to reduce false discoveries by identifying and removing edges resulting from shared confounding factors. I propose sparse structural factor equation models, in which structural equation models are used to capture directed graphs while factor analysis models are used to account for potential latent factors. I have shown that the proposed method enables me to obtain a simpler and more interpretable topology of a gene regulatory network.

The third project is devoted to the development of a new regression analysis methodology to analyze electroencephalogram (EEG) neuroimaging data that are correlated among electrodes within an EEG-net. To address analytic challenges pertaining to the integration of network topology into the analysis, I propose hybrid quadratic inference functions that incorporate both prior and data-driven correlations among network nodes into statistical estimation and inference. The proposed method is conceptually simple, computationally fast, and, more importantly, has appealing large-sample properties. In a real EEG data analysis, I applied the proposed method to detect a significant association between iron deficiency and the event-related potential measured in two subregions, which was not found using the classical spatial ANOVA random-effects models.

【 Preview 】
Attachment list
Files Size Format View
Statistical Methods for High Dimensional Networked Data Analysis. 2414KB PDF download