| Genome Biology | |
| Leveraging biological and statistical covariates improves the detection power in epigenome-wide association testing | |
| Liwen Wang1  Xianyang Zhang2  Jun Chen3  Yue Yu4  Shulin Ruan5  Jinyan Huang5  Zhiyin An5  Ling Bai5  Bowen Cui5  Liang Wu5  | |
| [1] Department of General Surgery, Rui-Jin Hospital, Shanghai Jiao Tong University; [2] Department of Statistics, Texas A&M University; [3] Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic; [4] Division of Digital Health Sciences, Mayo Clinic; [5] State Key Laboratory of Medical Genomics, Shanghai Institute of Hematology, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Jiao Tong University | |
| Keywords: False discovery rate; EWAS; Multiple hypothesis testing; Covariate | |
| DOI: 10.1186/s13059-020-02001-7 | |
| Source: DOAJ | |
【 Abstract 】
Background: Epigenome-wide association studies (EWAS), which seek associations between epigenetic marks and an outcome or exposure, involve multiple hypothesis testing. False discovery rate (FDR) control has been widely used for multiple testing correction. However, traditional FDR control methods do not use auxiliary covariates, and they can lose power when the covariates are informative about the likelihood of the null hypothesis. Many covariate-adaptive FDR control methods have recently been developed, but their application to EWAS data has not yet been explored. It is not clear whether these methods can significantly improve detection power and, if so, which covariates are most relevant for EWAS data.

Results: In this study, we evaluate the performance of five covariate-adaptive FDR control methods with EWAS-related covariates, using simulated as well as real EWAS datasets. We develop an omnibus test to assess the informativeness of the covariates. We find that statistical covariates are generally more informative than biological covariates, and that the methylation mean and variance are almost universally informative. In contrast, the informativeness of biological covariates depends on the specific dataset. We show that the independent hypothesis weighting (IHW) and covariate adaptive multiple testing (CAMT) methods are overall more powerful, especially for sparse signals, and could improve the detection power by a median of 25% and 68%, respectively, on real datasets, compared to the ST procedure. We further validate the findings in various biological contexts.

Conclusions: Covariate-adaptive FDR control methods with informative covariates can significantly increase the detection power for EWAS. For sparse signals, IHW and CAMT are recommended.
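
To illustrate why an informative covariate can raise detection power, the following minimal sketch contrasts the standard Benjamini-Hochberg (BH) procedure with a weighted variant in which each p-value is divided by a covariate-derived weight. This is not IHW, CAMT, or any of the five methods evaluated in the study; it is a generic weighted-BH illustration. The function names, the p-values, and the weights are all hypothetical, and the weights are assumed to be fixed in advance from a covariate (for example, methylation variance) rather than learned from the p-values themselves.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the sorted indices rejected by the BH step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max smallest p-values.
    return sorted(order[:k_max])


def weighted_bh(pvalues, weights, alpha=0.05):
    """Weighted BH: run BH on p_i / w_i after rescaling weights to average 1.

    Assumption: weights are non-negative and chosen without looking at
    the p-values (e.g. derived from an auxiliary covariate).
    """
    m = len(pvalues)
    total = sum(weights)
    normalized = [w * m / total for w in weights]
    adjusted = [
        min(1.0, p / w) if w > 0 else 1.0
        for p, w in zip(pvalues, normalized)
    ]
    return benjamini_hochberg(adjusted, alpha)


# Hypothetical example: eight CpG sites, the first four with a covariate
# value that suggests a higher prior chance of being non-null.
pvals = [0.001, 0.012, 0.02, 0.03, 0.2, 0.5, 0.7, 0.9]
wts = [2.0, 2.0, 2.0, 2.0, 0.5, 0.5, 0.5, 0.5]

print(benjamini_hochberg(pvals))   # unweighted rejections
print(weighted_bh(pvals, wts))     # covariate-weighted rejections
```

With these made-up numbers the weighted procedure rejects more hypotheses than plain BH at the same nominal level, because the up-weighted sites face a more lenient threshold. With uniform weights the two procedures coincide, which reflects the abstract's point that gains depend on the covariate being informative.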
【 License 】
Unknown