Journal article details
BioData Mining
Global tests of P-values for multifactor dimensionality reduction models in selection of optimal number of target genes
Hongying Dai [3], Madhusudan Bhandary [1], Mara Becker [2], J Steven Leeder [2], Roger Gaedigk [2], Alison A Motsinger-Reif [4]
[1] Department of Mathematics, Columbus State University, 4225 University Avenue, Columbus, GA, 31907, USA
[2] Division of Clinical Pharmacology and Medical Toxicology, Department of Pediatrics, Children’s Mercy Hospital, 2401 Gillham Road, Kansas City, MO 64108, USA
[3] Department of Medical Research, Children’s Mercy Hospital, 2401 Gillham Road, Kansas City, MO 64108, USA
[4] Bioinformatics Research Center, Department of Statistics, North Carolina State University, 1 Lampe Dr, Raleigh, NC, 27695-7566, USA
Keywords: Multifactor dimensionality reduction; ReliefF; Global tests; P-value
DOI: 10.1186/1756-0381-5-3
Received 2012-01-17, accepted 2012-04-19, published 2012
【 Abstract 】

Background

Multifactor Dimensionality Reduction (MDR) is a popular and successful data mining method developed to detect and characterize nonlinear, complex gene-gene interactions (epistasis) that are associated with disease susceptibility. Because MDR uses a combinatorial search strategy to detect interactions, several filtration techniques have been developed to remove genes (SNPs) that have no interactive effects prior to analysis. However, the cutoff values used by these filtration methods are arbitrary, so different choices of cutoff lead to different selections of genes (SNPs).

Methods

We suggest incorporating a global test of p-values into filtration procedures to identify the optimal number of genes/SNPs for further MDR analysis, and we demonstrate this approach using a ReliefF filter technique. We compare the performance of different global testing procedures in this context, including the Kolmogorov-Smirnov test, the inverse chi-square test, the inverse normal test, the logit test, the Wilcoxon test and Tippett’s test. Additionally, we demonstrate the approach in a real data application: a candidate gene study of drug response in Juvenile Idiopathic Arthritis.
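To make the idea of a global test concrete, the following is a minimal sketch of three of the combining procedures named above (inverse chi-square, inverse normal, and Tippett's test) in their textbook forms. It assumes independent p-values strictly between 0 and 1, which is a simplification: the study itself concerns correlated p-values. The function names are illustrative and are not taken from the paper or from any MDR software.

```python
import math
from statistics import NormalDist


def _check(pvalues):
    # Textbook forms require every p-value to lie strictly inside (0, 1).
    if not pvalues or any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must be non-empty and lie in (0, 1)")


def inverse_chi_square_test(pvalues):
    """Fisher's method: -2 * sum(ln p_i) ~ chi-square with 2k df under H0."""
    _check(pvalues)
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # statistic divided by 2
    # Chi-square survival function for even df (2k) has a closed form:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def inverse_normal_test(pvalues):
    """Inverse normal method: sum of normal quantiles of (1 - p_i), scaled by sqrt(k)."""
    _check(pvalues)
    std_normal = NormalDist()
    z = sum(std_normal.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(len(pvalues))
    return 1.0 - std_normal.cdf(z)


def tippett_test(pvalues):
    """Tippett's method: based only on the smallest p-value."""
    _check(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** len(pvalues)


if __name__ == "__main__":
    # Hypothetical p-values: many modest signals, none individually striking.
    modest = [0.08] * 10
    print("inverse chi-square:", inverse_chi_square_test(modest))
    print("inverse normal:    ", inverse_normal_test(modest))
    print("Tippett:           ", tippett_test(modest))
```

With ten hypothetical p-values of 0.08, the inverse chi-square and inverse normal tests accumulate evidence across all inputs, whereas Tippett's test looks only at the minimum. This is one intuition for why a minimum-based test can lose power when many effects are individually small.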

Results

Extensive simulations of correlated p-values show that the inverse chi-square test is the most appropriate global test to combine with the screening approach when determining the optimal number of SNPs for the final MDR analysis. The Kolmogorov-Smirnov test shows highly inflated Type I error when p-values are highly correlated or when p-values peak near the center of the histogram. Tippett’s test has very low power when the effect size of GxG interactions is small.

Conclusions

The proposed global tests can serve as a screening step prior to individual tests to prevent false discovery. Because they offer strong power in small samples and well-controlled Type I error in the absence of GxG interactions, global tests are highly recommended in epistasis studies.

【 License 】

2012 Dai et al.; licensee BioMed Central Ltd.
