BMC Bioinformatics
hsphase: an R package for pedigree reconstruction, detection of recombination events, phasing and imputation of half-sib family groups
Mohammad H Ferdosi [1], Brian P Kinghorn [1], Julius HJ van der Werf [1], Seung Hwan Lee [2], Cedric Gondro [1]
[1] The Centre for Genetic Analysis and Applications, School of Environmental and Rural Science, University of New England, Armidale, Australia
[2] Hanwoo Experiment Station, National Institute of Animal Science, RDA, Pyeongchang, Korea
Keywords: Pedigree reconstruction; Parentage testing; Genotyping; Linkage analysis; Haplotypes; Recombination; Imputation; Phasing; SNP
Others: 818471; DOI: 10.1186/1471-2105-15-172
Received 2013-10-11; accepted 2014-05-27; published 2014
【 Abstract 】
Background
Identifying recombination events and the chromosomal segments that contributed to an individual is useful for a number of applications in genomic analyses, including haplotyping, imputation, detection of signatures of selection, and improved estimates of relationship and of the probability of identity by descent. Genotypic data on half-sib family groups are widely available in livestock genomics. This family structure makes it possible to identify recombination events accurately even with only a few individuals, and it lends itself well to a range of applications such as parentage assignment and pedigree verification.
Results
Here we present hsphase, an R package that exploits the genetic structure found in half-sib livestock data to identify and count recombination events, impute and phase un-genotyped sires, and phase their offspring. The package also allows reconstruction of family groups (pedigree inference), identification of pedigree errors and parentage assignment. Additional functions in the package allow identification of genomic mapping errors, imputation of paternal high density genotypes from low density genotypes, and evaluation of phasing results either from hsphase or from other phasing programs. Various diagnostic plotting functions permit rapid visual inspection of results and evaluation of datasets.
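As a rough illustration of how half-sib structure can expose pedigree errors, the sketch below counts opposing homozygotes between pairs of animals: loci where one animal is homozygous for one allele and the other is homozygous for the alternative allele. A sire and its true offspring cannot show such a conflict (barring genotyping errors), and paternal half-sibs tend to show fewer conflicts than unrelated animals. This is a minimal Python sketch of the general idea, not the hsphase implementation; the function names, the 0/1/2 genotype coding with 9 for missing calls, and the toy data are all assumptions made for the example.

```python
from itertools import combinations


def opposing_homozygotes(g1, g2, missing=9):
    """Count loci where one animal is 0 (AA) and the other is 2 (BB).

    Genotypes are assumed to be coded 0/1/2, with `missing` marking
    uncalled loci; loci missing in either animal are skipped.
    """
    count = 0
    for a, b in zip(g1, g2):
        if a == missing or b == missing:
            continue
        if (a == 0 and b == 2) or (a == 2 and b == 0):
            count += 1
    return count


def pairwise_oh(genotypes):
    """Return {(id_i, id_j): opposing-homozygote count} for all pairs."""
    ids = sorted(genotypes)
    return {
        (i, j): opposing_homozygotes(genotypes[i], genotypes[j])
        for i, j in combinations(ids, 2)
    }


# Hypothetical toy data: one sire, two compatible calves, one misassigned animal.
genotypes = {
    "sire":   [0, 2, 1, 0, 2, 1],
    "calf_a": [0, 1, 1, 1, 2, 0],
    "calf_b": [1, 2, 2, 0, 1, 1],
    "stray":  [2, 0, 1, 2, 0, 1],
}

oh = pairwise_oh(genotypes)
for pair, n in sorted(oh.items()):
    print(pair, n)
```

In this toy data the two calves show no conflicts with the sire, while the misassigned animal conflicts with it at four of six loci. With real SNP array data the counts would be taken over thousands of markers, and a threshold or clustering step would be needed to separate family groups.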
Conclusion
The hsphase package provides a suite of functions for analysis and visualization of genomic structures in half-sib family groups, implemented in the widely used R programming environment. Low-level functions were implemented in C++ and parallelized to improve performance. hsphase was primarily designed for use with high density SNP array data, but it is fast enough to run directly on sequence data once they become more widely available. The package is available (GPL 3) from the Comprehensive R Archive Network (CRAN) or from http://www-personal.une.edu.au/~cgondro2/hsphase.htm.
【 License 】
2014 Ferdosi et al.; licensee BioMed Central Ltd.