Journal article details
Statistical Analysis and Data Mining
Distance‐based analysis of variance: Approximate inference
Christopher Minas [1], Giovanni Montana [1]
[1] Department of Mathematics, Imperial College London, London, UK
Keywords: distance‐based inference; Pearson type III approximation; genomics; MANOVA; neuroimaging
DOI: 10.1002/sam.11227
Subject classification: Social sciences, humanities and arts (general)
Source: John Wiley & Sons, Inc.
【 Abstract 】

In several modern applications, ranging from genomics to neuroimaging, there is a need to compare measurements across different populations, such as those collected from samples of healthy and diseased individuals. The interest is in detecting a group effect, and typically many thousands or even millions of tests need to be performed simultaneously, as exemplified in genomics, where a single test is applied to each gene across the genome. Traditional procedures, such as multivariate analysis of variance (MANOVA), are not suitable when dealing with nonvector‐valued data structures such as functional or graph‐structured observations. In this article, we discuss an existing distance‐based MANOVA‐like approach, the distance‐based F (DBF) test, for detecting such differences. The null sampling distribution of the DBF test statistic depends on the distribution of the measurements and the chosen distance measure, and is generally unavailable in closed form. In practice, Monte Carlo permutation methods are deployed, which introduce errors in estimating small p‐values and increase familywise type I error rates when too few permutations are used. In this work, we propose an approximate distribution for the DBF test, allowing inferences to be drawn without the need for costly permutations. This is achieved by using moment matching to fit a Pearson type III distribution to the permutation distribution that would be obtained by enumerating all permutations. The use of the Pearson type III distribution is motivated via empirical observations with real data. We provide evidence with real and simulated data that the resulting approximate null distribution of the DBF test is flexible enough to work well with a range of distance measures.
Through extensive simulations involving different data types and distance measures, we provide evidence that the proposed methodology yields the same statistical power that would otherwise only be achievable if many millions of Monte Carlo permutations were performed.
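
The two ideas in the abstract can be illustrated with a minimal Python sketch, which is not the authors' implementation. It assumes a generic pseudo-F statistic computed from pairwise distances (the exact form of the DBF statistic in the paper may differ), a Monte Carlo permutation p-value, and the standard shifted-gamma parameterisation of the Pearson type III distribution matched to a given mean, variance and skewness. The function names `pseudo_f`, `permutation_p_value` and `pearson3_from_moments` are hypothetical, and how the permutation moments are obtained is not covered here.

```python
import math
import random


def pseudo_f(d, labels):
    """Pseudo-F statistic from a symmetric distance matrix d (assumed form)."""
    n = len(labels)
    groups = sorted(set(labels))
    g = len(groups)
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(d[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for grp in groups:
        idx = [i for i in range(n) if labels[i] == grp]
        pair_sum = sum(d[a][b] ** 2 for k, a in enumerate(idx) for b in idx[k + 1:])
        ss_within += pair_sum / len(idx)
    ss_between = ss_total - ss_within
    return (ss_between / (g - 1)) / (ss_within / (n - g))


def permutation_p_value(d, labels, n_perm=999, seed=0):
    """Monte Carlo permutation p-value; it can never fall below 1 / (n_perm + 1)."""
    rng = random.Random(seed)
    observed = pseudo_f(d, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign group labels at random
        if pseudo_f(d, shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


def pearson3_from_moments(mean, var, skew):
    """Shifted-gamma (Pearson type III) parameters matching three moments.

    Requires skew != 0. The scale is signed, so a negative skewness
    corresponds to a reflected gamma distribution.
    """
    shape = 4.0 / skew ** 2
    scale = math.sqrt(var) * skew / 2.0
    loc = mean - shape * scale
    return shape, scale, loc


if __name__ == "__main__":
    # Toy data: two well-separated groups of points on a line.
    pts = [0.0, 0.1, 0.2, 0.3, 0.4, 10.0, 10.1, 10.2, 10.3, 10.4]
    labels = ["a"] * 5 + ["b"] * 5
    d = [[abs(x - y) for y in pts] for x in pts]
    print(pseudo_f(d, labels))
    print(permutation_p_value(d, labels, n_perm=199))
    print(pearson3_from_moments(1.0, 4.0, 0.5))
```

With `n_perm` permutations the smallest attainable p-value is `1 / (n_perm + 1)`, which is why very small p-values require very many permutations. A moment-matched parametric tail does not have that resolution limit.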

【 License 】

Unknown   
