Journal article

【Abstract】

The detection of ancient gene flow between human populations is an important issue in population genetics. A common tool for detecting ancient admixture events is the D-statistic. The D-statistic is based on the hypothesis of a genetic relationship that involves four populations, whose correctness is assessed by evaluating specific coincidences of alleles between the groups. When working with high-throughput sequencing data, calling genotypes accurately is not always possible; therefore, the D-statistic currently samples a single base from the reads of one individual per population. This implies ignoring much of the information in the data, an issue especially striking in the case of ancient genomes. We provide a significant improvement to overcome the problems of the D-statistic by considering all reads from multiple individuals in each population. We also apply type-specific error correction to combat the problems of sequencing errors, and show a way to correct for introgression from an external population that is not part of the supposed genetic relationship, and how this leads to an estimate of the admixture rate. We prove that the D-statistic is approximated by a standard normal distribution. Furthermore, we show that our method outperforms the traditional D-statistic in detecting admixtures. The power gain is most pronounced for low and medium sequencing depth (1–10×), and performances are as good as with perfectly called genotypes at a sequencing depth of 2×. We show the reliability of error correction in scenarios with simulated errors and ancient data, and correct for introgression in known scenarios to estimate the admixture rates.
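As an illustration of the traditional single-base version of the test mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the four populations are ordered (H1, H2, H3, H4) with H4 as the outgroup, that one base has already been sampled per population at each site, and that the sign convention is D = (nABBA - nBABA) / (nABBA + nBABA); other sources use the opposite sign. The function name `d_statistic` and the input layout are hypothetical.

```python
from typing import Sequence, Tuple


def d_statistic(sites: Sequence[Tuple[str, str, str, str]]) -> float:
    """Compute a single-base D-statistic from sampled bases.

    Each element of `sites` holds one sampled base for the populations
    (H1, H2, H3, H4) at one site. The layout is an assumption made for
    this sketch, not a format defined by the paper.
    """
    n_abba = 0
    n_baba = 0
    for h1, h2, h3, h4 in sites:
        if h1 == h2:
            # H1 and H2 agree, so the site is uninformative for the test.
            continue
        if h1 == h4 and h2 == h3:
            # Pattern ABBA: H2 shares its allele with H3.
            n_abba += 1
        elif h1 == h3 and h2 == h4:
            # Pattern BABA: H1 shares its allele with H3.
            n_baba += 1
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites found")
    # Assumed sign convention: positive values mean an excess of ABBA sites.
    return (n_abba - n_baba) / total


# Toy input: two ABBA sites, one BABA site, one uninformative site.
example_sites = [
    ("A", "G", "G", "A"),
    ("C", "T", "T", "C"),
    ("G", "A", "G", "A"),
    ("A", "A", "A", "A"),
]
d_value = d_statistic(example_sites)
```

With the toy input above, there are two ABBA sites and one BABA site, so D = (2 - 1) / 3 = 1/3. In practice the value is judged against its standard error (the abstract states that the statistic is approximately standard normal after standardization), and the method described in the abstract replaces the single sampled base with information from all reads of multiple individuals per population.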

【License】

CC BY|CC BY-NC

G3: Genes, Genomes, Genetics
Powerful Inference with the D-Statistic on Low-Coverage Whole-Genome Data
article
Samuele Soraggi¹ Carsten Wiuf¹ Anders Albrechtsen²
[1] Department of Mathematical Sciences, Faculty of Science, University of Copenhagen, 2100, Denmark; Center for Bioinformatics, Faculty of Science, University of Copenhagen, 2100, Denmark
Keywords: admixture; gene flow; introgression; D-statistic; ABBA–BABA test; tree test; four-population test; ANGSD; next-generation sequencing data; low depth
DOI: 10.1534/g3.117.300192
Subject classification: Social Sciences, Humanities and Arts (General)
Source: Genetics Society of America