Journal article details
BMC Genetics
Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test
Christoph Lange [1], Elisabeth Mangold [2], Kerstin U Ludwig [4], Taofik AlChawa [2], Deborah Blacker [3], David M Swanson [5]
[1] Institute for Genomic Mathematics
[2] German Center for Neurodegenerative Diseases, University of Bonn, Bonn, Germany
[3] Institute of Human Genetics, University of Bonn, Bonn, Germany
[4] Departments of Psychiatry and Epidemiology, Massachusetts General Hospital, Boston, Massachusetts
[5] Department of Genomics, Life and Brain Center, University of Bonn, Bonn, Germany
[6] Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts
Keywords: Permutation tests; Gene-based testing; Eigenvector; Dimension reduction
Others  :  1086376
DOI  :  10.1186/1471-2156-14-108
Received 2013-06-11, accepted 2013-10-18, published 2013
【 Abstract 】

Background

The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study of their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, even though such associations are potentially more useful from a biological perspective, and the methods currently implemented are often permutation-based.

Results

One property of some permutation-based tests is that their power varies depending on whether significant markers lie in regions of linkage disequilibrium (LD), which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and an outcome, neither of which has power that varies as a function of LD structure. One method uses dimension reduction to “filter” redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only the Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. Because the specified correlation structure between markers can be inaccurate, we also evaluate the versatility of the modified summary-statistic test.
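
To make the idea of scaling marker Z-statistics by their correlation matrix concrete, the sketch below computes a quadratic-form gene statistic from summary data alone. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact statistic, so the form Z' R^{-1} Z with a chi-square reference distribution is an assumed, commonly used construction, and the function names `chi2_sf` and `summary_statistic_test` are hypothetical. The modification for a misspecified correlation matrix is not shown.

```python
import math

import numpy as np


def chi2_sf(x, df):
    """Upper tail P(X > x) of a chi-square variable with df degrees of freedom.

    Uses the power series of the regularized lower incomplete gamma
    function; adequate for the moderate statistics in this sketch.
    """
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= half_x / (a + n)
        total += term
        n += 1
    log_lower = a * math.log(half_x) - half_x - math.lgamma(a) + math.log(total)
    return max(0.0, 1.0 - math.exp(log_lower))


def summary_statistic_test(z, corr):
    """Gene-level test from marker Z-statistics and their correlation matrix.

    ASSUMPTION: under the null, z ~ N(0, corr), so z' corr^{-1} z is
    chi-square with len(z) degrees of freedom. Only summary data are used.
    """
    z = np.asarray(z, dtype=float)
    corr = np.asarray(corr, dtype=float)
    stat = float(z @ np.linalg.solve(corr, z))
    return stat, chi2_sf(stat, df=z.size)


# Two markers with the same Z-statistics, in strong LD versus independent.
z_scores = [2.5, 2.5]
ld_corr = [[1.0, 0.9], [0.9, 1.0]]
no_ld_corr = [[1.0, 0.0], [0.0, 1.0]]

stat_ld, p_ld = summary_statistic_test(z_scores, ld_corr)
stat_indep, p_indep = summary_statistic_test(z_scores, no_ld_corr)
print(f"strong LD:   stat={stat_ld:.3f}, p={p_ld:.4f}")
print(f"independent: stat={stat_indep:.3f}, p={p_indep:.4f}")
```

In this toy example the two markers carry largely the same information when they are in strong LD, so the scaled statistic is smaller than it is for independent markers, whereas an unscaled sum of squared Z-statistics would be identical in both cases.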

Conclusion

We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach, and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of chronic obstructive pulmonary disease (COPD) and a correlation structure obtained from HapMap. Because this correlation structure is assumed to be imperfectly known, we experiment with the modification of the test.

【 License 】

   
2013 Swanson et al.; licensee BioMed Central Ltd.
