Genetic Analysis Workshop 17
Proceedings, Open Access
Detection of associations with rare and common SNPs for quantitative traits: a nonparametric Bayes-based approach
Biological sciences; Medicine and health
Lili Ding (1,2,*); Tesfaye M Baye (2,3); Hua He (4); Xue Zhang (4); Brad G Kurowski (2,5); Lisa J Martin (1,2,4)
Others: http://www.biomedcentral.com/content/pdf/1753-6561-5-S9-S10.pdf
PID: 42410
Source: CEUR
【 Abstract 】
We propose a nonparametric Bayes-based clustering algorithm to detect associations with rare and common single-nucleotide polymorphisms (SNPs) for quantitative traits. Unlike current methods, our approach identifies associations with rare genetic variants at the variant level, not the gene level. In this method, we use a Dirichlet process prior for the distribution of SNP-specific regression coefficients, conduct hierarchical clustering with a distance measure derived from posterior pairwise probabilities of two SNPs having the same regression coefficient, and explore data-driven approaches to select the number of clusters. SNPs falling inside the largest cluster have relatively low or close to zero estimates of regression coefficients and are considered not associated with the trait. SNPs falling outside the largest cluster have relatively high estimates of regression coefficients and are considered potential risk variants. Using the data from the Genetic Analysis Workshop 17, we successfully detected associations with both rare and common SNPs for a quantitative trait. We conclude that our method provides a novel and broadly applicable strategy for obtaining association results with a reasonably low proportion of false discovery and that it can be routinely used in resequencing studies.

Background

The two highly debated hypotheses on the genetic basis of complex human diseases are the common disease/common variant (CDCV) hypothesis and the common disease/rare variant (CDRV) hypothesis [1]. The CDCV hypothesis states that common diseases are caused by common variants (minor allele frequency [MAF] > 5%) with small to modest effects. The CDRV hypothesis, on the other hand, argues that common diseases are caused by multiple rare variants (MAF < 5%), each with moderate to high penetrance.
Although both common and rare variants likely play a role in complex human diseases, most statistical strategies for association analysis have been developed under the CDCV assumption, except recent work by Li and Leal [2] and Han and Pan [3]. A key strategy for association analysis with rare variants is to study the cumulative effect of multiple rare variants within the same gene or linkage disequilibrium block [2,4,5]. However, these methods identify genetic risk factors at the gene level, not the variant level. We propose a nonparametric Bayes-based approach to detect associations with both rare and common genetic variants for quantitative traits. This approach clusters single-nucleotide polymorphisms (SNPs) according to the magnitude of SNP-specific regression coefficients. SNPs clustered together could come from different linkage disequilibrium blocks, genes, or even different chromosomes and could have quite different MAFs.

Methods

Suppose that for each individual $i$ ($i = 1, 2, \ldots, n$) we observe $y_i$, a quantitative trait; $z_i$, a $p$-dimensional vector of individual-specific covariates, such as age and sex; and $x_i = (x_{i1}, x_{i2}, \ldots, x_{iJ})$, genotypes at J
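
The excerpt breaks off before the model is written out. As a rough illustration of the clustering step summarized in the abstract, the sketch below assumes that a Dirichlet process sampler has already produced cluster labels for each SNP at each posterior draw (the input `label_draws` is hypothetical). It estimates the posterior probability that two SNPs share a regression coefficient as the fraction of draws in which they share a label, uses one minus that probability as the distance, and flags SNPs that fall outside the largest cluster. Average linkage and a hand-picked number of clusters are assumptions made for this example only; the abstract says the number of clusters is chosen with data-driven approaches and does not name a linkage rule.

```python
from itertools import combinations


def coclustering_distance(label_draws):
    """Distance 1 - P(SNP j and SNP k share a cluster), estimated from draws.

    label_draws: list of posterior draws; each draw is a list holding one
    cluster label per SNP (hypothetical sampler output).
    """
    n_draws = len(label_draws)
    n_snps = len(label_draws[0])
    dist = [[0.0] * n_snps for _ in range(n_snps)]
    for j, k in combinations(range(n_snps), 2):
        same = sum(draw[j] == draw[k] for draw in label_draws)
        dist[j][k] = dist[k][j] = 1.0 - same / n_draws
    return dist


def average_linkage(dist, n_clusters):
    """Naive agglomerative clustering; merges until n_clusters remain."""
    clusters = [[j] for j in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average pairwise distance between the two candidate clusters.
            d = sum(dist[j][k] for j in clusters[a] for k in clusters[b])
            d /= len(clusters[a]) * len(clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best  # a < b, so deleting b keeps index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def flag_snps(label_draws, n_clusters):
    """Return indices of SNPs that fall outside the largest cluster."""
    clusters = average_linkage(coclustering_distance(label_draws), n_clusters)
    largest = max(clusters, key=len)
    return sorted(j for c in clusters if c is not largest for j in c)


# Toy input: 4 posterior draws of cluster labels for 6 SNPs.
draws = [
    [0, 0, 0, 0, 1, 2],
    [0, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 2],
    [0, 0, 0, 1, 2, 2],
]
print(flag_snps(draws, n_clusters=2))  # [4, 5]
```

In this toy input, SNPs 0 to 3 are nearly always sampled together and form the largest cluster, so only SNPs 4 and 5 are flagged as potential risk variants. The pairwise loop is quadratic in the number of SNPs per merge and is meant for illustration, not for resequencing-scale data.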