| BMC Bioinformatics | |
| KRLMM: an adaptive genotype calling method for common and low frequency variants | |
| Methodology Article | |
| Meredith Yeager1  Rafael A Irizarry2  Ruijie Liu3  Zhiyin Dai3  Matthew E Ritchie4  | |
| [1] Cancer Genomics Research Laboratory, SAIC-Frederick, Inc., NCI-Frederick, 20877, Frederick, Maryland, USA;Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, CLSB 11007, 450 Brookline Ave, 02215, Boston, Massachusetts, USA;Molecular Medicine Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, 3052, Parkville, Victoria, Australia;Molecular Medicine Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, 3052, Parkville, Victoria, Australia;Department of Mathematics and Statistics, The University of Melbourne, 3010, Parkville, Victoria, Australia;Department of Medical Biology, The University of Melbourne, 3010, Parkville, Victoria, Australia; | |
| Keywords: Genotyping; Clustering; Microarray data analysis | |
| DOI : 10.1186/1471-2105-15-158 | |
| Received 2014-02-24, accepted 2014-05-19, published 2014 | |
| Source: Springer | |
【 Abstract 】
Background: SNP genotyping microarrays have revolutionized the study of complex disease. The current range of commercially available genotyping products contains extensive catalogues of low frequency and rare variants. Existing SNP calling algorithms have difficulty dealing with these low frequency variants, as the underlying models rely on each genotype having a reasonable number of observations to ensure accurate clustering.

Results: Here we develop KRLMM, a new method for converting raw intensities into genotype calls that aims to overcome this issue. Our method is unique in that it applies careful between-sample normalization and allows a variable number of clusters k (1, 2 or 3) for each SNP, where k is predicted using the available data. We compare our method to four genotyping algorithms (GenCall, GenoSNP, Illuminus and OptiCall) on several Illumina data sets that include samples from the HapMap project where the true genotypes are known in advance. All methods were found to have high overall accuracy (> 98%), with KRLMM consistently amongst the best. At low minor allele frequency, the KRLMM, OptiCall and GenoSNP algorithms were observed to be consistently more accurate than GenCall and Illuminus on our test data.

Conclusions: Methods that tailor their approach to calling low frequency variants by either varying the number of clusters (KRLMM) or using information from other SNPs (OptiCall and GenoSNP) offer improved accuracy over methods that do not (GenCall and Illuminus). The KRLMM algorithm is implemented in the open-source crlmm package distributed via the Bioconductor project (http://www.bioconductor.org).
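The idea of letting the number of clusters vary per SNP can be illustrated with a small sketch. This is not the KRLMM implementation: the abstract says only that k is predicted from the available data, so the sketch swaps in a deliberately simple stand-in rule (take the smallest k whose within-cluster spread falls below a tolerance). The function names `kmeans_1d` and `choose_k`, the `tol` value and the toy log-ratio values are all hypothetical.

```python
from math import sqrt
from statistics import mean


def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's k-means in one dimension; returns (centers, labels, wss)."""
    xs = sorted(values)
    n = len(xs)
    # Start the centers at evenly spaced quantiles so the result is deterministic.
    centers = [xs[int((2 * j + 1) * n / (2 * k))] for j in range(k)]

    def assign(cs):
        # Label each value with the index of its nearest center.
        return [min(range(k), key=lambda j: abs(v - cs[j])) for v in values]

    for _ in range(iters):
        labels = assign(centers)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            updated.append(mean(members) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    labels = assign(centers)
    wss = sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))
    return centers, labels, wss


def choose_k(values, max_k=3, tol=0.3):
    """Stand-in rule (an assumption, not KRLMM's predictor): return the
    smallest k whose root-mean-square within-cluster deviation is <= tol."""
    for k in range(1, max_k + 1):
        centers, labels, wss = kmeans_1d(values, k)
        if sqrt(wss / len(values)) <= tol or k == max_k:
            return k, centers, labels


# Hypothetical per-sample log-ratios for one rare-variant SNP:
# six homozygous samples and two heterozygous ones, with no third cluster.
rare = [-2.1, -2.0, -1.9, -2.05, -1.95, -2.0, 0.05, -0.05]
k, centers, labels = choose_k(rare)
print(k, labels)
```

Forcing three clusters on a SNP like `rare` would split the homozygous group artificially, which is the failure mode that a variable k is meant to avoid.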
【 License 】
Unknown
© Liu et al.; licensee BioMed Central Ltd. 2014. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| RO202311107432161ZK.pdf | 1618 KB | | |