Thesis details
Statistical issues in modelling the ancestry from Y-chromosome and surname data
QH426 Genetics; HA Statistics
Sharif, Maarya ; Macaulay, Vincent A.
University: University of Glasgow
Department: School of Mathematics and Statistics
Keywords: Y-chromosome, surname, most recent common ancestor, haplotype, haplogroup, British, genealogy, short tandem repeat, generation
Others  :  http://theses.gla.ac.uk/3407/1/2012sharifphd.pdf
Source: University of Glasgow
【 Abstract 】

A considerable industry has grown up around genealogical inference from genetic testing, supplementing more traditional genealogical techniques but with very limited quantification of uncertainty. In many societies Y-chromosomes are co-inherited with surnames, both being passed down from father to son. This thesis seeks to explore what this correlation can say about ancestry. In particular, it is concerned with estimation of the time to the most recent common paternal ancestor (TMRCA) for pairs of males who are not known to be directly related but share the same surname, based on the repeat number at short tandem repeat (STR) markers on their Y-chromosomes. We develop a model of TMRCA estimation based on the difference in repeat numbers in pairs of male haplotypes, using a Bayesian framework and Markov chain Monte Carlo techniques such as an adaptive Metropolis-Hastings algorithm. The model incorporates the process of STR discovery and the calibration of mutation rates, which can differ across STRs. In simulation studies, we find that the estimates of TMRCA are rather robust to the ascertainment process and the way in which it is modelled. However, they are affected by the site-specific mutation rates at the typed STRs. Indeed, sequencing the fastest-mutating STRs yields a lower error in the estimated TMRCA than sequencing random STRs. In the British context, we extend our model to include additional information such as the haplogroup status (as determined from single nucleotide polymorphisms, SNPs) of the pair of males, as well as the frequency and origin of the surname. In general, the effect of this is to reduce estimates of the TMRCA for pairs of males with an older TMRCA, typically outwith the period of surname establishment (about 500-700 years ago).
In the genealogical context, incorporating surname frequency (within the prior distribution) results in lower estimates of TMRCA for pairs of males who appear to have diverged from a common male ancestor since the period of surname establishment. In addition, we include uncertainty in the years per generation conversion factor in our model.
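
As a rough illustration of the kind of inference described above, the following Python sketch samples a posterior for the TMRCA (in generations) of one pair of males from their per-marker repeat-number differences. It is not the model of the thesis. It assumes a symmetric single-step mutation model, under which the difference at a marker with per-generation rate mu after t generations on each lineage is the difference of two Poisson(mu * t) counts; an exponential prior on t; and a plain (non-adaptive) random-walk Metropolis-Hastings sampler. The function names, the prior mean, the step size and the mutation rates are all hypothetical choices made for illustration.

```python
import math
import random


def log_skellam_pmf(d, lam, terms=40):
    """log P(D = d), where D is the difference of two independent Poisson(lam) counts.

    Uses P(d) = exp(-2*lam) * I_|d|(2*lam), with the modified Bessel function
    expanded as I_k(2*lam) = sum_m lam**(2m + k) / (m! * (m + k)!).
    """
    k = abs(d)
    log_terms = [
        (2 * m + k) * math.log(lam) - math.lgamma(m + 1) - math.lgamma(m + k + 1)
        for m in range(terms)
    ]
    top = max(log_terms)  # log-sum-exp to avoid underflow
    return -2.0 * lam + top + math.log(sum(math.exp(v - top) for v in log_terms))


def log_posterior(t, diffs, rates, prior_mean):
    """Unnormalised log posterior of t (generations to the common ancestor)."""
    if t <= 0:
        return -math.inf
    loglik = sum(log_skellam_pmf(d, mu * t) for d, mu in zip(diffs, rates))
    return loglik - t / prior_mean  # exponential prior (assumed), up to a constant


def sample_tmrca(diffs, rates, prior_mean=100.0, n_iter=5000, step=20.0, seed=1):
    """Random-walk Metropolis-Hastings draws of t, with the first fifth discarded."""
    rng = random.Random(seed)
    t = prior_mean
    lp = log_posterior(t, diffs, rates, prior_mean)
    draws = []
    for _ in range(n_iter):
        proposal = t + rng.gauss(0.0, step)  # symmetric proposal
        lp_prop = log_posterior(proposal, diffs, rates, prior_mean)
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            t, lp = proposal, lp_prop
        draws.append(t)
    return draws[n_iter // 5:]
```

Calling `sample_tmrca([1, 0, 2, 0, 1, 0, 0, 1, 0, 1], [0.002] * 10)` returns posterior draws for a pair typed at ten markers with an assumed common rate of 0.002 per generation. Multiplying the draws by a years-per-generation factor, itself uncertain as the abstract notes, would convert them to years.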

【 Preview 】
Attachments
Files Size Format View
Statistical issues in modelling the ancestry from Y-chromosome and surname data 4627KB PDF download