Thesis Details
Statistical Inference and Computational Methods for Large High-Dimensional Data with Network Structure.
Roy, Sandipan; Zhu, Ji
University of Michigan
Keywords: Network; Heterogeneous; High-dimensional; Subsampling; Statistics and Numeric Data; Science; Statistics
Others: https://deepblue.lib.umich.edu/bitstream/handle/2027.42/113602/sandipan_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

New technological advancements have allowed the collection of datasets of large volume and varying levels of complexity. Many of these datasets have an underlying network structure. Networks can capture dependence relationships among a group of entities, and hence analyzing these datasets unearths the underlying structural dependence among the individuals. Examples include gene regulatory networks, stock markets, protein-protein interactions within the cell, and online social networks. The thesis addresses two important aspects of large high-dimensional data with network structure. The first focuses on high-dimensional data with a network structure that evolves over time. Examples of such datasets include time-course gene expression data and voting records of legislative bodies. The main task is to estimate the change-point as well as the network structures before and after it. The network structures are obtained by a penalized optimization method, and we establish a finite-sample estimation error bound for the change-point in the high-dimensional regime. The second aspect concerns parameter estimation in large heterogeneous data with network structure. Our primary goal is to develop efficient computational techniques based on random subsampling and parallelization to estimate the parameters. We provide an analysis of the rate of decay of the bias and variance of our parallel implementation with a single round of communication after every iteration. We further show two applications of our methodology, to the Gaussian Mixture Model (GMM) and the Stochastic Block Model (SBM). The emphasis is placed on developing new theoretical techniques and computational tools for network problems and applying the corresponding methodology in many fields, including biomedical and social science research, where network modeling and analysis play an exceedingly important role.

【 Preview 】
Attachment list
Files | Size | Format | View
Statistical Inference and Computational Methods for Large High-Dimensional Data with Network Structure. | 9141KB | PDF | download
Document metrics
Downloads: 8  Views: 28