Journal article details
BMC Bioinformatics
Evaluation of tree-based statistical learning methods for constructing genetic risk scores
Claudia Wigmann [1], Sara Kress [1], Tamara Schikowski [1], Michael Lau [2], Holger Schwender [2]
[1]IUF – Leibniz Research Institute for Environmental Medicine
[2]Mathematical Institute, Heinrich Heine University
Keywords: Polygenic risk scores; Epistasis; Statistical learning; Random forests; Variable selection; Logic regression
DOI: 10.1186/s12859-022-04634-w
Source: DOAJ
【 Abstract 】
Background: Genetic risk scores (GRS) summarize genetic features such as single nucleotide polymorphisms (SNPs) in a single statistic with respect to a given trait. So far, GRS have typically been built using generalized linear models or regularized extensions. However, these linear methods are usually unable to incorporate gene-gene interactions or non-linear SNP-response relationships. Tree-based statistical learning methods such as random forests and logic regression may be an alternative to such regularized-regression-based methods and are investigated in this article. Moreover, we consider modifications of random forests and logic regression for the construction of GRS.

Results: In an extensive simulation study and an application to a real data set from a German cohort study, we show that both tree-based approaches can outperform elastic net when constructing GRS for binary traits. In particular, a modification of logic regression called logic bagging achieved comparatively high predictive power, as measured by the area under the curve, and comparatively high statistical power. Even when only marginal genetic effects and no epistatic interaction effects were considered, the regularized regression method led in most cases to inferior results.

Conclusions: When constructing GRS, we recommend taking random forests and logic bagging into account, in particular if it can be assumed that possibly unknown epistasis between SNPs is present. To develop the best possible prediction models, extensive joint hyperparameter optimizations should be conducted.
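
The limitation of linear scores mentioned in the abstract can be illustrated with a deliberately tiny toy example. The sketch below is not the authors' method or data. It assumes two hypothetical SNPs in a dominant 0/1 coding (carrier of the minor allele or not) and a purely epistatic, XOR-like trait, then compares a weighted-sum GRS with a score based on a Boolean rule of the kind that logic regression searches over. The function names `linear_grs` and `auc`, the weights, and the data are all illustrative assumptions.

```python
from itertools import product


def linear_grs(genotypes, weights):
    """Linear GRS: weighted sum of the SNP codes of each individual."""
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: every combination of two binary SNP indicators.
genotypes = list(product((0, 1), repeat=2))

# Assumed purely epistatic trait: case iff exactly one SNP carries the minor allele.
labels = [int(a != b) for a, b in genotypes]

# A linear score with (assumed) equal weights cannot separate cases from controls.
linear_auc = auc(linear_grs(genotypes, (1.0, 1.0)), labels)

# A Boolean rule, (SNP1 AND NOT SNP2) OR (NOT SNP1 AND SNP2), used as the score.
logic_scores = [int((a and not b) or (not a and b)) for a, b in genotypes]
logic_auc = auc(logic_scores, labels)

print(linear_auc)  # 0.5
print(logic_auc)   # 1.0
```

In this constructed setting, any choice of linear weights yields an AUC of 0.5, whereas the Boolean rule separates cases and controls perfectly. Real data contain marginal effects and noise, so the gap reported in the article's simulations is naturally less extreme.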
【 License 】

Unknown   
