PeerJ
Local ancestry prediction with PyLAE
article
Nikita Moshkov¹, Aleksandr Smetanin⁵, Tatiana V. Tatarinova⁶
[1] Doctoral School of Interdisciplinary Medicine, University of Szeged; Synthetic and Systems Biology Unit, Biological Research Centre; Atlas Biomed Group Limited; Laboratory on AI for Computational Biology, Faculty of Computer Science, HSE University; ITMO University; Department of Biology, University of La Verne; Siberian Federal University; Institute of General Genetics; Institute for Information Transmission Problems
Keywords: Local ancestry; HMM; Global ancestry; Bio-origin; Selection signals; 1000 Genomes
DOI: 10.7717/peerj.12502
Subject classification: Social sciences, humanities, and arts (general)
Source: Inra
【 Abstract 】
Summary: We developed PyLAE, a new tool for determining local ancestry along a genome from whole-genome sequencing data or high-density genotyping experiments. PyLAE can process an arbitrarily large number of ancestral populations, with or without an informative prior. Because PyLAE does not require estimating many parameters, it can process thousands of genomes within a day. It runs on phased or unphased genomic data. We show how PyLAE can be applied to identify differentially enriched pathways between populations; the local ancestry approach yields higher enrichment scores than whole-genome approaches. We benchmarked PyLAE on the 1000 Genomes dataset, comparing the aggregated predictions with global admixture results and with the current gold-standard program, RFMix. Computational efficiency, minimal data pre-processing requirements, straightforward presentation of results, and ease of installation make PyLAE a valuable tool for studying admixed populations.
Availability and implementation: The source code and installation manual are available at https://github.com/smetam/pylae.
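The benchmark described above compares aggregated local ancestry predictions with global admixture estimates. One plausible way to perform such an aggregation is a length-weighted average over ancestry segments, sketched below. This is an illustration only and not PyLAE's actual API or output format: the function name `aggregate_local_ancestry`, the `(start_bp, end_bp, label)` tuple layout, and the example labels are all assumptions made for this sketch.

```python
from collections import defaultdict


def aggregate_local_ancestry(segments):
    """Collapse per-segment ancestry labels into genome-wide proportions.

    `segments` is an iterable of (start_bp, end_bp, ancestry_label) tuples
    with half-open coordinates [start_bp, end_bp). This layout is a
    hypothetical one chosen for illustration, not PyLAE's output format.
    """
    length_by_label = defaultdict(int)
    for start, end, label in segments:
        if end <= start:
            raise ValueError(f"empty or inverted segment: {(start, end, label)}")
        # Weight each ancestry label by the number of base pairs it covers.
        length_by_label[label] += end - start

    total = sum(length_by_label.values())
    if total == 0:
        raise ValueError("no segments supplied")
    return {label: length / total for label, length in length_by_label.items()}


# Toy example: three segments on a single 1,000 bp stretch.
segments = [
    (0, 600, "EUR"),
    (600, 900, "AFR"),
    (900, 1000, "EUR"),
]
proportions = aggregate_local_ancestry(segments)
print(proportions)
```

Weighting by segment length rather than counting segments keeps many short segments from dominating the estimate; the resulting proportions can then be set side by side with a global admixture estimate for the same individual.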
【 License 】
CC BY