BMC Bioinformatics | |
A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data | |
Hua Zhou [1], Jin Zhou [2], Yiwen Liu [3], Joseph Watkins [4], Miao Zhang [5] | |
[1] Department of Biostatistics, University of California, Los Angeles, 650 Charles E. Young Dr. South, 90095, Los Angeles, USA;Department of Epidemiology and Biostatistics, University of Arizona, 1295 N. Martin Ave., 85724, Tucson, USA;Interdisciplinary Program in Statistics and Data Science, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA;Department of Medicine, UCLA David Geffen School of Medicine, Los Angeles, CA, USA;Department of Mathematics, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA;Department of Mathematics, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA;Interdisciplinary Program in Statistics and Data Science, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA;Interdisciplinary Program in Statistics and Data Science, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA; | |
Keywords: Dimension reduction; Non-linear kernel; Low-coverage; Population structure; Data-adaptive | |
DOI : 10.1186/s12859-021-04265-7 | |
Source: Springer | |
【 Abstract 】
Background: Low-depth sequencing allows researchers to increase sample size at the expense of lower accuracy. To incorporate uncertainties while maintaining statistical power, we introduce MCPCA_PopGen to analyze population structure of low-depth sequencing data.
Results: The method optimizes the choice of nonlinear transformations of dosages to maximize the Ky Fan norm of the covariance matrix. The transformation incorporates the uncertainty in calling between heterozygotes and the common homozygotes for loci having a rare allele and is more linear when both variants are common.
Conclusions: We apply MCPCA_PopGen to samples from two indigenous Siberian populations and reveal hidden population structure accurately using only a single chromosome. The MCPCA_PopGen package is available at https://github.com/yiwenstat/MCPCA_PopGen.
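The objective named in the abstract, the Ky Fan norm of a covariance matrix, can be illustrated with a minimal sketch. This is not the MCPCA_PopGen implementation: the function names `ky_fan_norm` and `covariance_ky_fan`, the toy dosage matrix, and the choice to treat loci as the variables are all assumptions made for illustration, and no nonlinear transformation is optimized here. The sketch only evaluates the quantity that such an optimization would try to increase.

```python
import numpy as np


def ky_fan_norm(matrix, k):
    """Ky Fan k-norm: the sum of the k largest singular values of `matrix`."""
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(singular_values[:k].sum())


def covariance_ky_fan(dosages, k):
    """Ky Fan k-norm of the sample covariance of a samples-by-loci matrix.

    Treating each column (locus) as a variable is an assumption of this sketch.
    """
    cov = np.cov(dosages, rowvar=False)
    return ky_fan_norm(cov, k)


# Hypothetical toy data: 50 samples, 20 loci, expected dosages in [0, 2].
rng = np.random.default_rng(0)
toy_dosages = rng.uniform(0.0, 2.0, size=(50, 20))
objective = covariance_ky_fan(toy_dosages, k=2)
```

For a covariance matrix, which is symmetric and positive semidefinite, the singular values coincide with the eigenvalues, so the Ky Fan k-norm equals the variance captured by the top k principal components.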
【 License 】
CC BY