Journal Article Details
BMC Bioinformatics
Appearance frequency modulated gene set enrichment testing
Research Article
Maureen A Sartor1  Jun Ma2  HV Jagadish3 
[1] Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI, USA;Department of EECS, University of Michigan, Ann Arbor, MI, USA;Department of EECS, University of Michigan, Ann Arbor, MI, USA;Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, MI, USA;
Keywords: KEGG Pathway; Enrichment Score; Inverse Document Frequency; mTOR Signaling Pathway; Appearance Frequency
DOI: 10.1186/1471-2105-12-81
Received 2010-07-05, accepted 2011-03-20, published 2011
Source: Springer
【 Abstract 】

Background: Gene set enrichment testing has helped bridge the gap from an individual gene to a systems biology interpretation of microarray data. Although gene sets are defined a priori based on biological knowledge, current methods for gene set enrichment testing treat all genes equally. It is well known that some genes, such as those responsible for housekeeping functions, appear in many pathways, whereas other genes are more specialized and play a unique role in a single pathway. Drawing inspiration from the field of information retrieval, we have developed and present here an approach that incorporates gene appearance frequency (in KEGG pathways) into two current methods, Gene Set Enrichment Analysis (GSEA) and the logistic regression-based LRpath framework, to generate more reproducible and biologically meaningful results.

Results: Two breast cancer microarray datasets were analyzed to identify gene sets differentially expressed between histological grade 1 and grade 3 breast cancer. The correlation of Normalized Enrichment Scores (NES) between gene sets, generated by the original GSEA and by GSEA with the appearance frequency of genes incorporated (GSEA-AF), was compared. GSEA-AF resulted in higher correlation between experiments and more overlapping top gene sets. Several cancer-related gene sets also achieved higher NES in GSEA-AF. The same datasets were also analyzed by LRpath and by LRpath with the appearance frequency of genes incorporated (LRpath-AF). Two well-studied lung cancer datasets were analyzed in the same manner to demonstrate the validity of the method, and similar results were obtained.

Conclusions: We introduce an alternative way to integrate KEGG PATHWAY information into gene set enrichment testing. The performance of GSEA and LRpath can be enhanced by integrating the appearance frequency of genes. We conclude that, in general, gene set analysis methods that integrate information from KEGG PATHWAY perform better both statistically and biologically.
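The abstract's central idea, giving less weight to genes that appear in many pathways, can be illustrated with the textbook inverse document frequency from information retrieval. The sketch below is only an assumption-laden illustration: the abstract does not state the authors' actual weighting formula, the function names are hypothetical, and the pathways and gene labels are made-up toy data rather than KEGG content.

```python
import math
from collections import Counter


def appearance_frequency(pathways):
    """Count, for each gene, the number of pathways that contain it."""
    counts = Counter()
    for genes in pathways.values():
        counts.update(set(genes))  # set() so a gene counts once per pathway
    return counts


def idf_weights(pathways):
    """Textbook IDF, log(N / af), used here as a stand-in gene weight.

    Assumption: this is the classical information-retrieval formula,
    not necessarily the weighting used in GSEA-AF or LRpath-AF.
    """
    n_pathways = len(pathways)
    af = appearance_frequency(pathways)
    return {gene: math.log(n_pathways / count) for gene, count in af.items()}


# Hypothetical toy pathways (not real KEGG data).
toy_pathways = {
    "pathway_A": ["G1", "G2", "G3"],
    "pathway_B": ["G1", "G4"],
    "pathway_C": ["G1", "G2"],
}

weights = idf_weights(toy_pathways)
for gene in sorted(weights):
    print(gene, round(weights[gene], 3))
```

In this toy example G1 occurs in every pathway and so receives weight 0, while G3 and G4, each specific to a single pathway, receive the largest weight. A real analysis would likely need a smoothed or rescaled weight so that ubiquitous genes are down-weighted rather than dropped entirely.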

【 License 】

CC BY   
© Ma et al; licensee BioMed Central Ltd. 2011
