Journal article

【Abstract】

Functional data analysis (FDA) has been highly successful in time series analysis. In recent biomedical research, it has also been used to analyze sequence variation in genome-wide association studies (GWAS). The genetic variants observed in an individual, called single-nucleotide polymorphisms (SNPs), are distributed over the loci of a DNA sequence. An individual's SNP sequence can therefore be regarded as a realization of a stochastic process, in the same way as a time series. However, SNPs are usually coded as the number of minor alleles, which is a categorical variable. The usual least-squares smoothing in FDA works well only when the data are continuous and normally distributed, and this normality assumption is violated for categorical SNP data. In this work, we propose a two-step method: first, smoothing the categorical SNPs with a novel method, and second, constructing haplotypes that are strongly associated with the disease using functional generalized linear models. We demonstrate its effectiveness on the real-world PennCATH dataset.

【License】

Unknown

Engineering Proceedings
Statistical Haplotypes Based on Functional Sequence Data Analysis for Genome-Wide Association Studies
article
Pei-Yun Sun¹ Guoqi Qian¹
[1] School of Mathematics and Statistics, University of Melbourne
Keywords: stochastic process; functional data analysis; genome-wide association study; epistasis; haplotype; variable selection
DOI: 10.3390/engproc2023039029
Source: MDPI