Journal Article Details
G3: Genes, Genomes, Genetics
Alignment-Free Population Genomics: An Efficient Estimator of Sequence Diversity
Bernhard Haubold [1], Peter Pfaffelhuber [2]
[1] Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany; [2] Mathematical Stochastics, Mathematical Institute, Albert-Ludwigs University, 79085 Freiburg, Germany
Keywords: genetic diversity; alignment-free; maximum-likelihood; Drosophila; match length distribution
DOI: 10.1534/g3.112.002527
Subject classification: Biological Sciences (General)
Source: Genetics Society of America
【 Abstract 】

Comparative sequencing contributes critically to the functional annotation of genomes. One prerequisite for successful analysis of the increasingly abundant comparative sequencing data is the availability of efficient computational tools. We present here a strategy for comparing unaligned genomes based on a coalescent approach combined with advanced algorithms for indexing sequences. These algorithms are particularly efficient when analyzing large genomes, as their run time ideally grows only linearly with sequence length. Using this approach, we have derived and implemented a maximum-likelihood estimator of the average number of mismatches per site between two closely related sequences, π. By allowing for fluctuating coalescent times, we are able to improve a previously published alignment-free estimator of π. We show through simulation that our new estimator is fast and accurate even with moderate recombination (ρ ≤ π). To demonstrate its applicability to real data, we compare the unaligned genomes of Drosophila persimilis and D. pseudoobscura. In agreement with previous studies, our sliding window analysis locates the global divergence minimum between these two genomes to the pericentromeric region of chromosome 3.

【 License 】

Unknown   
