Dissertation Details
Statistical Methods and Analysis in Genome Wide Association Studies and Next-Generation Sequencing.
Chen, Wei; Swaroop, Anand
University of Michigan
Keywords: Genome-wide Association Study; Age-related Macular Degeneration; Genotype Calling and Haplotype Inference; State Space Reduction Method; Next-generation Sequencing; Genetics; Statistics and Numeric Data; Health Sciences; Science; Biostatistics
Others: https://deepblue.lib.umich.edu/bitstream/handle/2027.42/89741/weich_1.pdf?sequence=1&isAllowed=y
Switzerland | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】
Genome-wide association studies (GWAS), which examine common genetic variants in thousands of individuals, have identified many genetic loci associated with a variety of complex diseases and phenotypes. Next-generation sequencing (NGS) technologies allow us to extend these studies to rarer variants not typically evaluated by GWAS. In this dissertation, I present novel statistical methods and software to dissect the genetic basis of complex traits in the context of both GWAS and NGS. First, I present a large-scale GWAS for age-related macular degeneration (AMD). Our studies extend the catalog of AMD-associated loci and provide clues about the underlying cellular pathways. A novel aspect of our study is a prediction method that uses all susceptibility loci to help identify individuals at high risk of disease. The prediction can be extended to the general population through a weighting scheme that combines disease prevalence with the case-control ratio in the GWAS sample. Second, I describe an interactive package that provides graphical overviews of the results of whole-genome association studies in datasets with rich multi-dimensional phenotypic information, such as global surveys of gene expression. Third, I propose and implement an efficient Hidden Markov Model (HMM) based method for genotype calling and haplotype inference in parent-offspring trios. Our method considers both linkage disequilibrium (LD) patterns and the constraints imposed by the family structure when assigning individual genotypes and haplotypes. Using simulations and sequencing data from ongoing projects, we show that trios provide higher genotype calling accuracy across the frequency spectrum, both overall and at hard-to-call heterozygous sites. In addition, sequencing trios greatly improves haplotype phasing accuracy. Finally, I describe an efficient state space reduction method for haplotype inference and genotype calling. This method is motivated by the increasing computational burden of the HMM-based approaches used to describe haplotype sharing in GWAS and NGS data. Our method takes advantage of local similarity between haplotypes and reduces the HMM state space dynamically, while preserving the accuracy of the full state space method. Through simulation and real data analysis, we show that this method can yield substantial savings in both memory and CPU time.
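The abstract does not spell out how the state space is reduced, so the following is only a minimal sketch of the general idea it names: haplotypes that are locally identical can share one HMM state. The function names, the toy reference panel, and the use of fixed-width windows are all assumptions made for illustration (the abstract says the reduction is dynamic); this is not the dissertation's algorithm.

```python
from collections import defaultdict


def collapse_window(haplotypes, start, end):
    """Group haplotype indices that carry identical alleles on sites [start, end).

    Hypothetical helper: each distinct local pattern stands for one reduced state.
    """
    groups = defaultdict(list)
    for idx, hap in enumerate(haplotypes):
        groups[tuple(hap[start:end])].append(idx)
    return dict(groups)


def reduced_state_counts(haplotypes, window):
    """Count distinct local patterns in consecutive fixed-width windows.

    Fixed windows are an assumption made to keep the sketch short.
    """
    n_sites = len(haplotypes[0])
    counts = []
    for start in range(0, n_sites, window):
        end = min(start + window, n_sites)
        counts.append(len(collapse_window(haplotypes, start, end)))
    return counts


# Toy panel: 4 haplotypes x 6 biallelic sites (made-up data).
panel = [
    [0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 0],
    [0, 0, 1, 1, 1, 0],
    [1, 0, 1, 1, 1, 0],
]

print(reduced_state_counts(panel, 3))  # [2, 3]
```

In this toy panel the full state space has 4 states per site, while the two windows need only 2 and 3 distinct states. Savings of this kind grow with panel size when many haplotypes share local segments.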
Attachments
Files | Size | Format | View
Statistical Methods and Analysis in Genome Wide Association Studies and Next-Generation Sequencing. | 2443KB | PDF | download