Dissertation

【Abstract】

Given multiple correlated data sets, an important question is how to use them jointly to benefit later statistical inference. This setting is increasingly realistic as more and more related data sets are collected, such as images and their descriptions, articles in multiple languages, and actors in multiple social networks. Real data are also often multivariate or high-dimensional, so dimension reduction is necessary before any inference. In this dissertation, I consider three dimension reduction and matching methods, namely principal component analysis followed by Procrustes matching, canonical correlation analysis, and nonlinear matching using shortest-path distance and joint neighborhood. I investigate their theoretical properties and their impact on later inference using the Procrustes fitting error, classification error, and hypothesis testing, respectively. The main conclusion of this dissertation is that, given a particular inference task for multiple correlated data sets, joint matching and projection may significantly improve inference performance compared to separate projection or omitting modalities. Numerical experiments on simulated and real data are provided to illustrate the theorems and the methodology.

【Preview】

Attachment list
Files	Size	Format	View
Matching and Inference for Multiple Correlated Data Sets	1215KB	PDF	download


Matching and Inference for Multiple Correlated Data Sets
Shen, Cencheng; Fishkind, Donniell E.
Johns Hopkins University
Keywords: Dimension reduction; Machine learning; Data matching; Statistical inference; Applied Mathematics & Statistics
Others : https://jscholarship.library.jhu.edu/bitstream/handle/1774.2/39366/SHEN-DISSERTATION-2015.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: JOHNS HOPKINS DSpace Repository


Document metrics
Downloads: 37	Views: 16
