BMC Bioinformatics
GENLIB: an R package for the analysis of genealogical data
Héloïse Gauvin [3], Jean-François Lefebvre [3], Claudia Moreau [3], Eve-Marie Lavoie [2], Damian Labuda [4], Hélène Vézina [2], Marie-Hélène Roy-Gagnon [1]
[1] School of Epidemiology, Public Health and Preventive Medicine, Faculty of Medicine, University of Ottawa, 600 Peter Morand Cres, Room 101E, Ottawa K1G 5Z3, ON, Canada
[2] BALSAC Project, Université du Québec à Chicoutimi, Chicoutimi, Québec, Canada
[3] Centre de recherche, Centre hospitalier universitaire Sainte-Justine, Montréal, Québec, Canada
[4] Département de pédiatrie, Université de Montréal, Montréal, Québec, Canada
Keywords: Gene-dropping simulations; Genetics; Inbreeding; Kinship; Historical demography; Software; Founder populations; Genealogical data
DOI: 10.1186/s12859-015-0581-5
Received 2014-12-19, accepted 2015-04-22, published 2015
【 Abstract 】
Background
Founder populations play an important role in the study of genetic diseases, and access to detailed genealogical records is often one of their advantages. These genealogical data provide unique information for researchers in evolutionary and population genetics, demography and genetic epidemiology. However, analyzing large genealogical datasets requires specialized methods and software. The GENLIB software was developed to study the large genealogies of the French Canadian population of Quebec, Canada. These genealogies are accessible through the BALSAC database, which contains over 3 million records covering the whole province of Quebec over four centuries. Using this resource, extended pedigrees of up to 17 generations can be constructed from a sample of present-day individuals.
Results
We have extended and implemented GENLIB as a package in the R environment for statistical computing and graphics, thus allowing optimal flexibility for users. The GENLIB package includes basic functions to manage genealogical data allowing, for example, extraction of a part of a genealogy or selection of specific individuals. There are also many functions providing information to describe the size and complexity of genealogies as well as functions to compute standard measures such as kinship, inbreeding and genetic contribution. GENLIB also includes functions for gene-dropping simulations.
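To make the standard measures named above concrete, here is a minimal Python sketch of how kinship, inbreeding and genetic contribution can be computed recursively on a small pedigree. It is an illustration under stated assumptions, not GENLIB's implementation: the toy pedigree and the function names (`depth`, `kinship`, `inbreeding`, `genetic_contribution`) are hypothetical, and the recursion shown is the textbook one, which may differ from what the package uses internally.

```python
from functools import lru_cache

# Hypothetical toy pedigree: individual -> (father, mother).
# None marks an unknown parent, so individuals 1 and 2 are founders.
PEDIGREE = {
    1: (None, None),
    2: (None, None),
    3: (1, 2),
    4: (1, 2),
    5: (3, 4),  # child of full siblings
}


@lru_cache(maxsize=None)
def depth(i):
    """Generation depth: 0 for founders, else 1 + the deepest known parent."""
    parents = [p for p in PEDIGREE[i] if p is not None]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


@lru_cache(maxsize=None)
def kinship(i, j):
    """Probability that an allele drawn at random from i and one drawn from j are IBD."""
    if i is None or j is None:
        return 0.0  # an unknown parent is treated as unrelated to everyone
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    # Recurse through the deeper individual, which cannot be an ancestor of the other.
    if depth(i) < depth(j):
        i, j = j, i
    father, mother = PEDIGREE[i]
    return 0.5 * (kinship(father, j) + kinship(mother, j))


def inbreeding(i):
    """Inbreeding coefficient of i: the kinship coefficient of its parents."""
    father, mother = PEDIGREE[i]
    return kinship(father, mother)


def genetic_contribution(founder, i):
    """Expected share of i's genome inherited from founder: sum of (1/2)^g over all paths."""
    if i is None:
        return 0.0
    if i == founder:
        return 1.0
    father, mother = PEDIGREE[i]
    return 0.5 * (genetic_contribution(founder, father)
                  + genetic_contribution(founder, mother))
```

On this toy pedigree, `kinship(3, 4)` gives 0.25 (full siblings), `inbreeding(5)` gives 0.25, and each founder contributes 0.5 to individual 5. Memoization with `lru_cache` keeps the recursion tractable on a small example; a genealogy of tens of thousands of individuals would need a more careful, likely iterative, strategy.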
The goal of this paper is to present the full functionality of GENLIB. We used a sample of 140 individuals from the province of Quebec (Canada) to demonstrate GENLIB's functions. Ascending genealogies for these individuals were reconstructed using BALSAC, yielding a large pedigree of 41,523 individuals. Using GENLIB's functions, we provide a detailed description of these genealogical data in terms of completeness, genetic contribution of founders, relatedness, inbreeding and the overall complexity of the genealogical tree. We also present gene-dropping simulations based on the whole genealogy to investigate identical-by-descent sharing of alleles and of chromosomal segments of different lengths, and to estimate probabilities of identical-by-descent sharing.
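The idea behind gene dropping can be sketched in a few lines of Python. The example below is a single-locus illustration only, under stated assumptions: the toy pedigree and the function names (`drop_genes`, `estimate_autozygosity`) are hypothetical, recombination and chromosomal segments are ignored, and nothing here reflects GENLIB's actual interface. Each founder receives two unique alleles, alleles are passed down by random Mendelian transmission, and the fraction of replicates in which a proband carries two copies of the same founder allele estimates its probability of identity by descent.

```python
import random

# Hypothetical toy pedigree, listed so that parents always precede their children.
# Each individual has either both parents known or both unknown (founder).
PEDIGREE = {
    1: (None, None),
    2: (None, None),
    3: (1, 2),
    4: (1, 2),
    5: (3, 4),  # child of full siblings
}


def drop_genes(pedigree, rng):
    """Simulate one replicate: return individual -> (paternal allele, maternal allele)."""
    genotypes = {}
    for ind, (father, mother) in pedigree.items():
        if father is None and mother is None:
            # Founders carry two distinct, uniquely labelled alleles.
            genotypes[ind] = ((ind, "a"), (ind, "b"))
        else:
            # Each parent transmits one of its two alleles with probability 1/2.
            genotypes[ind] = (rng.choice(genotypes[father]),
                              rng.choice(genotypes[mother]))
    return genotypes


def estimate_autozygosity(pedigree, proband, n_sim=20000, seed=1):
    """Monte Carlo estimate of the probability that the proband's two alleles are IBD."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(n_sim):
        paternal, maternal = drop_genes(pedigree, rng)[proband]
        hits += paternal == maternal
    return hits / n_sim
```

For individual 5, the estimate should land close to the analytical inbreeding coefficient of 0.25 for a child of full siblings, with Monte Carlo error shrinking as `n_sim` grows. Extending the sketch to chromosomal segments would require simulating recombination along a genetic map, which is beyond this illustration.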
Conclusions
The R package GENLIB provides a user-friendly and flexible environment for analyzing extensive genealogical data. It allows efficient and easy integration of different types of data, analytical methods and additional developments, which makes it well suited to genealogical analysis.
【 License 】
2015 Gauvin et al.; licensee BioMed Central.