Journal article

【Abstract】

Background: Statistical analysis of genome-wide microarrays can result in many thousands of identical statistical tests being performed, as each probe is tested for an association with a phenotype of interest. If there were no association between any of the probes and the phenotype, the distribution of P values obtained from the statistical tests would resemble a Uniform distribution. If a selection of probes were significantly associated with the phenotype, we would expect to observe P values for these probes of less than the designated significance level, alpha, resulting in more P values of less than alpha than expected by chance.

Results: In data from a whole genome methylation promoter array we unexpectedly observed P value distributions where there were fewer P values less than alpha than would be expected by chance. Our data suggest that a possible reason for this is a violation of the statistical assumptions required for these tests, arising from heteroskedasticity. A simple but statistically sound remedy (a heteroskedasticity-consistent covariance matrix estimator to calculate standard errors of regression coefficients that are robust to heteroskedasticity) rectified this violation and resulted in meaningful P value distributions.

Conclusions: The statistical analysis of ‘omics data requires careful handling, especially in the choice of statistical test. To obtain meaningful results it is essential that the assumptions behind these tests are carefully examined and any violations rectified where possible, or a more appropriate statistical test chosen.
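The remedy named in the Results can be illustrated with a minimal sketch. It assumes a regression with a single predictor, White's HC0 form of the heteroskedasticity-consistent estimator, and a large-sample normal approximation for the P value. The abstract does not state which variant of the estimator the authors used, and the function names `slope_with_standard_errors` and `two_sided_p` are hypothetical, introduced here for illustration only.

```python
import random
from statistics import NormalDist, fmean


def slope_with_standard_errors(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (b, classical SE of b, HC0 robust SE of b).
    """
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

    # Classical SE: assumes every observation shares one residual variance.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = (s2 / sxx) ** 0.5

    # HC0 (assumed variant): each observation contributes its own squared
    # residual, so unequal variances do not distort the standard error.
    se_hc0 = sum(d * d * e * e for d, e in zip(dx, resid)) ** 0.5 / sxx
    return b, se_classical, se_hc0


def two_sided_p(estimate, se):
    """Two-sided P value from a normal approximation (an assumption here)."""
    z = abs(estimate / se)
    return 2.0 * (1.0 - NormalDist().cdf(z))


if __name__ == "__main__":
    # Simulated probe with no true association, but noise that grows with x.
    rng = random.Random(0)
    x = [rng.uniform(0.0, 1.0) for _ in range(200)]
    y = [rng.gauss(0.0, 0.1 + 2.0 * xi) for xi in x]
    b, se_c, se_r = slope_with_standard_errors(x, y)
    print("slope:", b)
    print("classical P:", two_sided_p(b, se_c))
    print("robust P:", two_sided_p(b, se_r))
```

Repeating the comparison across many simulated probes and plotting both sets of P values as histograms is one way to check whether the robust standard errors bring the null distribution closer to Uniform.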

【License】

CC BY
© Barton et al.; licensee BioMed Central Ltd. 2013

BMC Genomics
Correction of unexpected distributions of P values from analysis of whole genome arrays by rectifying violation of statistical assumptions
Methodology Article
Karen A Lillycrop¹ Sheila J Barton² Hazel M Inskip² Sarah R Crozier² Keith M Godfrey³
[1] Human Development and Health Academic Unit, University of Southampton, Southampton, UK;School of Biological Sciences, University of Southampton, Southampton, UK;MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, UK;MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton, UK;NIHR Southampton Biomedical Research Centre, University of Southampton and University Hospital Southampton NHS Foundation Trust, Southampton, UK;Human Development and Health Academic Unit, University of Southampton, Southampton, UK;
Keywords: P values; Distributions; Statistical analysis; Statistical assumptions; Whole genome methylation promoter arrays; Epigenome
DOI: 10.1186/1471-2164-14-161
Received 2012-07-26, accepted 2013-03-06, published 2013
Source: Springer