期刊论文详细信息
Algorithms for Molecular Biology
Estimating population size via line graph reconstruction
Bjarni V Halldórsson1  Dima Blokh2  Roded Sharan2 
[1] School of Science and Engineering, Reykjavík University, Menntavegur 1, Reykjavik 101, Iceland
[2] Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
关键词: Integer programming;    Line graphs;    Haplotypes;    Population size;   
Others  :  793282
DOI  :  10.1186/1748-7188-8-17
 received in 2012-12-20, accepted in 2013-06-29,  发布年份 2013
PDF
【 摘 要 】

Background

We propose a novel graph theoretic method to estimate haplotype population size from genotype data. The method considers only the potential sharing of haplotypes between individuals and is based on transforming the graph of potential haplotype sharing into a line graph using a minimum number of edge and vertex deletions.

Results

We show that the resulting line graph deletion problems are NP complete and provide exact integer programming solutions for them. We test our approach using extensive simulations of multiple population evolution and genotypes sampling scenarios. Our results also indicate that the method may be useful in comparing populations and it may be used as a first step in a method for haplotype phasing.

Conclusions

Our computational experiments show that when most of the sharings are true sharings the problem can be solved very fast and the estimated size is very close to the true size; when many of the potential sharings do not stem from true haplotype sharing, our method gives reasonable lower bounds on the underlying number of haplotypes. In comparison, a naive approach of phasing the input genotypes provides trivial upper bounds of twice the number of genotypes.

【 授权许可】

   
2013 Halldórsson et al.; licensee BioMed Central Ltd.

【 预 览 】
附件列表
Files Size Format View
20140705045836127.pdf 216KB PDF download
【 参考文献 】
  • [1]Halldórsson BV, Bafna V, Edwards N, Lippert R, Yooseph S, Istrail S: A survey of computational methods for determining haplotypes. In Computational Methods for SNPs and Haplotype Inference (LNCS 2983). Springer; 2004:26-47. [http://citeseerx.ist.psu.edu/viewdoc/summary? webcite]
  • [2]Catanzaro D, Godi A, Labbé M: A class representative model for pure Parsimony Haplotyping. INFORMS J Comput 2009, 22(2):195-209.
  • [3]Halldórsson BV, Aguiar D, Tarpine R, Istrail S: The Clark Phaseable sample size problem: long-range phasing and loss of heterozygosity in GWAS. J Comput Biol 2011, 18(3):323-333.
  • [4]Prabhu S, Pe’er I: Overlapping pools for high-throughput targeted resequencing. Genome Res 2009, 19:1254-61.
  • [5]The International HapMap Consortium: Integrating common and rare genetic variation in diverse human populations. Nature 2010, 467:52-58.
  • [6]Whitney H: Congruent graphs and the connectivity of graphs. Am J Math 1932, 54:150-162.
  • [7]Lehot PGH: An optimal algorithm to detect a line graph and output its root graph. J ACM 1974, 21:569-575.
  • [8]Roussopoulos N: A max(m,n) algorithm for determining the graph H from its line graph G. Inf Process Lett 1974, 2:108-112.
  • [9]Van Rooij A, Wilf H: The interchange graphs of a finite graph. Acta Math Acad Sci Hungar 1965, 16:263-269.
  • [10]Clark A: Inference of haplotypes from PCR-amplified samples of diploid populations. Mol Biol Evol 1990, 7:111-122.
  • [11]Yannakakis M: Node-and edge-deletion NP-complete problems. In Proceedings of the tenth annual ACM symposium on Theory of computing,. STOC ’78. New York: ACM; 1978:253-264..
  • [12]Cai L: Fixed-parameter tractability of graph modification problems for hereditary properties. Inf Process Lett 1996, 58:171-176.
  • [13]Niedermeier R, Rossmanith P: An efficient fixed-parameter algorithm for 3-Hitting Set. J Discrete Algorithms 2003, 1:89-102.
  • [14]Trevisan L: Non-approximability results for optimization problems on bounded degree instances. In Proceedings of the Thirty-Third Annual ACM Symposium on Theory of Computing.. ACM; 2001:453-461.
  • [15]Even S, Bar-Yehuda R: A linear-time approximation algorithm for the weighted vertex cover problem. J Algorithms 1981, 2(2):198-203.
  • [16]Campelo M, Campos V, Correa R: On the asymmetric representatives formulation for the vertex coloring problem. Discrete Appl Math 2008, 156(7):1097-1111.
  • [17]Hudson RR: Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics 2002, 18(2):337-338.
  • [18]Browning BL, Browning SR: A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am J Human Genet 2009, 84(2):210-223.
  文献评价指标  
  下载次数:12次 浏览次数:13次