Advances in exome sequencing and the development of exome genotyping arrays are enabling explorations of association between rare coding variants and complex traits using sequencing-based GWAS. However, the cost of sequencing remains high, the optimal study design for sequencing-based association studies is an open question, and powerful association methods and software to detect trait-associated rare and low-frequency variants are greatly needed. In addition, although chromosome X contains about 5% of the information in the human genome, it has been largely neglected in routine GWAS analysis. In this dissertation, I focus on three topics.

First, I describe a computationally efficient approach to reconstruct gene-level association test statistics from single-variant summary statistics and their covariance matrices, for single studies and for meta-analyses. Using simulations and real data examples, I evaluate our methods under the null, investigate scenarios in which family samples have greater power than population samples, compare the power of different types of gene-level tests under various trait-generating models, and demonstrate the usage of our methods and the C++ software, RAREMETAL, by meta-analyzing SardiNIA and HUNT data on lipid levels.

Second, I describe a variance component approach and a series of gene-level tests for the analysis of X-linked rare variants. Using simulations, I demonstrate that type I error is well controlled under the null. I compare the power to detect an autosomal gene and an X-linked gene of the same effect size, and investigate the effect of the sex ratio in a sample on the power to detect an X-linked gene. Finally, I demonstrate the usage of our method and the C++ software by analyzing various quantitative traits measured in the SardiNIA study, and report the detected X-linked variants and genes.

Third, I describe a novel likelihood-based approach and the C++ software, RAREFY, to prioritize, under a limited sequencing budget, the individuals in a sample who are most likely to carry trait-associated variants. I first describe the statistical method for small pedigrees and then describe an MCMC approach that makes our method computationally feasible for large pedigrees. Using simulations and real data analysis, I compare our approach with other methods in terms of both the power to discover trait-associated alleles and the power to detect association, and demonstrate the usage of our method on pedigrees from the SardiNIA study.
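
To illustrate the idea behind the first topic, the sketch below shows how one simple gene-level test, a weighted burden test, can be computed from single-variant score statistics and their covariance matrix alone, and how such summary statistics might be pooled across studies by summation. It is a minimal sketch under stated assumptions, not the RAREMETAL implementation: the function names `burden_from_summary` and `combine_studies`, the equal default weights, and all numbers are hypothetical, and both studies are assumed to list the same variants in the same order.

```python
import math


def combine_studies(score_list, cov_list):
    """Pool per-study score vectors and covariance matrices by summation.

    Assumption: every study reports the same variants in the same order.
    """
    m = len(score_list[0])
    scores = [sum(s[i] for s in score_list) for i in range(m)]
    cov = [[sum(c[i][j] for c in cov_list) for j in range(m)] for i in range(m)]
    return scores, cov


def burden_from_summary(scores, cov, weights=None):
    """Weighted burden statistic T = (w'U)^2 / (w'Vw).

    U holds the single-variant score statistics and V their covariance.
    Under the null, T is referred to a chi-square distribution with 1 df.
    """
    m = len(scores)
    w = weights if weights is not None else [1.0] * m
    numerator = sum(w[i] * scores[i] for i in range(m))
    variance = sum(w[i] * cov[i][j] * w[j] for i in range(m) for j in range(m))
    stat = numerator * numerator / variance
    # Upper tail of a 1-df chi-square: P(X > t) = erfc(sqrt(t / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical summary statistics for two variants in two studies.
study_scores = [[1.0, 2.0], [2.0, 1.0]]
study_covs = [[[4.0, 1.0], [1.0, 4.0]], [[4.0, 1.0], [1.0, 4.0]]]

pooled_scores, pooled_cov = combine_studies(study_scores, study_covs)
stat, p_value = burden_from_summary(pooled_scores, pooled_cov)
print(f"burden statistic = {stat:.3f}, p-value = {p_value:.4f}")
```

Because only the score vector and covariance matrix are needed, no individual-level genotypes have to be shared between studies in this sketch; other gene-level tests would combine the same two quantities differently.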
Design and Association Methods for Next-generation Sequencing Studies for Quantitative Traits.