Journal Article Details
BMC Bioinformatics
Gene-set distance analysis (GSDA): a powerful tool for gene-set association analysis
Xueyuan Cao [1], Stan Pounds [2]
[1] Department of Acute and Tertiary Care, University of Tennessee Health Science Center, 38163, Memphis, USA; [2] Department of Biostatistics, St Jude Children’s Research Hospital, 38105, Memphis, USA
Keywords: Gene profiling; Gene set; Distance correlation
DOI: 10.1186/s12859-021-04110-x
Source: Springer
【 Abstract 】

Background: Identifying sets of related genes (gene sets) that are empirically associated with a treatment or phenotype often yields valuable biological insights. Several methods effectively identify gene sets in which individual genes have simple monotonic relationships with categorical, quantitative, or censored event-time variables. Some distance-based methods, such as distance correlations, may detect complex non-monotone associations of a gene-set with a quantitative variable that elude other methods. However, the distance correlations have yet to be generalized to associate gene-sets with categorical and censored event-time endpoints. Also, there is a need to determine which genes empirically drive the significance of an association of a gene set with an endpoint.

Results: We develop gene-set distance analysis (GSDA) by generalizing distance correlations to evaluate the association of a gene set with categorical and censored event-time variables. We also develop a backward elimination procedure to identify a subset of genes that empirically drive significant associations. In simulation studies, GSDA more effectively identified complex non-monotone gene-set associations than did six other published methods. In the analysis of a pediatric acute myeloid leukemia (AML) data set, GSDA was the only method to discover that event-free survival (EFS) was associated with the 56-gene AML pathway gene-set, narrow that result down to 5 genes, and confirm the association of those 5 genes with EFS in a separate validation cohort. These results indicate that GSDA effectively identifies and characterizes complex non-monotonic gene-set associations that are missed by other methods.

Conclusion: GSDA is a powerful and flexible method to detect gene-set association with categorical, quantitative, or censored event-time variables, especially to detect complex non-monotonic gene-set associations. Available at https://CRAN.R-project.org/package=GSDA.
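
To illustrate why a distance correlation can pick up a non-monotone association that a linear correlation misses, here is a minimal Python sketch of the classical sample distance correlation between two quantitative data matrices. It is not the GSDA package's API and does not implement the paper's generalization to categorical or censored event-time endpoints; the function names and the toy data are assumptions made only for this illustration.

```python
import math

def pairwise_distances(rows):
    """Euclidean distance matrix between samples (each row is one sample)."""
    n = len(rows)
    return [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]

def double_center(d):
    """Subtract row and column means and add back the grand mean."""
    n = len(d)
    row_means = [sum(r) / n for r in d]
    grand_mean = sum(row_means) / n
    # d is symmetric, so column means equal row means.
    return [[d[i][j] - row_means[i] - row_means[j] + grand_mean
             for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation between two sets of rows of equal length."""
    n = len(x)
    a = double_center(pairwise_distances(x))
    b = double_center(pairwise_distances(y))

    def mean_product(u, v):
        return sum(u[i][j] * v[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2 = mean_product(a, b)
    denom = math.sqrt(mean_product(a, a) * mean_product(b, b))
    if denom == 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / denom)

def pearson(u, v):
    """Ordinary Pearson correlation, for comparison."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    var_u = sum((p - mu) ** 2 for p in u)
    var_v = sum((q - mv) ** 2 for q in v)
    return cov / math.sqrt(var_u * var_v)

# Hypothetical toy data: one "gene" whose expression has a U-shaped
# (non-monotone) relationship with a quantitative phenotype.
expression = [(i - 10) / 10 for i in range(21)]
phenotype = [v * v for v in expression]

linear_r = pearson(expression, phenotype)
dcor = distance_correlation([[v] for v in expression],
                            [[v] for v in phenotype])
print(f"Pearson r = {linear_r:.3f}, distance correlation = {dcor:.3f}")
```

Each sample is passed as a row, so a gene set with several genes would simply use rows with several columns. On the symmetric toy data the Pearson correlation is essentially zero while the distance correlation is clearly positive.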

【 License 】

CC BY
