Journal Article Details
BMC Proceedings
Evaluating methods for the analysis of rare variants in sequence data
Proceedings
Airat Bekmetjev [1], Nathan L Tintle [1], Alexandra Sitarik [2], Scott Powers [3], Ashley Petersen [4], Alexander Luedtke [5]
[1] Department of Mathematics, Computer Science and Statistics, Dordt College, 498 4th Ave NE, 51250, Sioux Center, IA, USA
[2] Department of Mathematics, Wittenberg University, PO Box 720, 200 West Ward Street, 45501, Springfield, OH, USA
[3] Department of Statistics and Operations Research, University of North Carolina, 318 Hanes Hall, CB 3260, 27599-3260, Chapel Hill, NC, USA
[4] Departments of Mathematics, Computer Science, and Statistics, St. Olaf College, 1520 St. Olaf Avenue, 55057, Northfield, MN, USA
[5] Division of Applied Mathematics, Brown University, 182 George Street, 02912, Providence, RI, USA
Keywords: Minor Allele Frequency; Population Stratification; Causal SNPs; Nonsynonymous SNPs; Simulated Phenotype
DOI: 10.1186/1753-6561-5-S9-S119
Source: Springer
【 Abstract 】

A number of rare variant statistical methods have been proposed for analysis of the impending wave of next-generation sequencing data. To date, there are few direct comparisons of these methods on real sequence data. Furthermore, there is a strong need for practical advice on the proper analytic strategies for rare variant analysis. We compare four recently proposed rare variant methods (combined multivariate and collapsing, weighted sum, proportion regression, and cumulative minor allele test) on simulated phenotype and next-generation sequencing data as part of Genetic Analysis Workshop 17. Overall, we find that all analyzed methods have serious practical limitations in identifying causal genes. Specifically, no method has more than a 5% true discovery rate (percentage of truly causal genes among all those identified as significantly associated with the phenotype). Further exploration shows that all methods suffer from inflated false-positive error rates (chance that a noncausal gene will be identified as associated with the phenotype) because of population stratification and gametic phase disequilibrium between noncausal SNPs and causal SNPs. Furthermore, the observed true-positive rate (chance that a truly causal gene will be identified as significantly associated with the phenotype) was very low (<19%) for each of the four methods. The combination of larger than anticipated false-positive rates, low true-positive rates, and only about 1% of all genes being causal yields poor discriminatory ability for all four methods. Gametic phase disequilibrium and population stratification are important areas for further research in the analysis of rare variant data.
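
The arithmetic behind the low true discovery rate can be illustrated with a minimal sketch. The causal fraction (about 1%) and the true-positive ceiling (19%) are taken from the abstract; the false-positive rate of 5% is a hypothetical value chosen only for illustration, since the abstract reports that false-positive rates were inflated but gives no figure. The function name `true_discovery_rate` is likewise an assumption, not something from the article.

```python
def true_discovery_rate(causal_fraction, tpr, fpr):
    """Expected share of flagged genes that are truly causal.

    causal_fraction: proportion of all genes that are causal
    tpr: chance that a causal gene is flagged as significant
    fpr: chance that a noncausal gene is flagged as significant
    """
    true_hits = causal_fraction * tpr
    false_hits = (1.0 - causal_fraction) * fpr
    total_hits = true_hits + false_hits
    if total_hits == 0:
        return float("nan")  # nothing flagged, so the rate is undefined
    return true_hits / total_hits


CAUSAL_FRACTION = 0.01  # abstract: about 1% of all genes are causal
TPR_CEILING = 0.19      # abstract: true-positive rates were below 19%
ASSUMED_FPR = 0.05      # hypothetical; the abstract gives no numeric value

tdr = true_discovery_rate(CAUSAL_FRACTION, TPR_CEILING, ASSUMED_FPR)
print(f"True discovery rate under these assumptions: {tdr:.1%}")
```

Under these assumptions, fewer than 1 in 20 flagged genes would be truly causal, even with the true-positive rate set at its reported ceiling. Because noncausal genes outnumber causal ones roughly 99 to 1, even a modest false-positive rate dominates the set of flagged genes.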

【 License 】

© Luedtke et al; licensee BioMed Central Ltd. 2011. This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
