Journal Article Details
BMC Bioinformatics
RedundancyMiner: De-replication of redundant GO categories in microarray and proteomics analysis
Software
Hongfang Liu1  Vladimir L Larionov1  Vinodh N Rajapakse2  John N Weinstein3  William Reinhold4  Yves G Pommier4  Barry R Zeeberg4  Robert F Bonner5  Martin Ehler5  Brian P Brooks6  Jacob D Brown6  Ari B Kahn7 
[1] Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, 4000 Reservoir Road, NW, 20007, Washington, DC, USA
[2] Department of Mathematics, University of Maryland, 20742, College Park, MD, USA
[3] Departments of Bioinformatics and Computational Biology and Systems Biology, M.D. Anderson Cancer Center, 77030, Houston, TX, USA
[4] Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, Room 5068, Building 37, 37 Convent Drive, 20892, Bethesda, MD, USA
[5] National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Section on Medical Biophysics, 20892, Bethesda, MD, USA
[6] National Institutes of Health, National Eye Institute, Ophthalmic Genetics and Visual Function Branch, 20892, Bethesda, MD, USA
[7] SRA International, Inc., Fairfax, VA, USA
Keywords: Gene Ontology; Nonnegative Matrix Factorization; Retinal Development; Nominal Number; Redundancy Problem
DOI: 10.1186/1471-2105-12-52
Received 2010-06-15, accepted 2011-02-10, published 2011
Source: Springer
【 Abstract 】

Background: The Gene Ontology (GO) Consortium organizes genes into hierarchical categories based on biological process, molecular function and subcellular localization. Tools such as GoMiner can leverage GO to perform ontological analysis of microarray and proteomics studies, typically generating a list of significant functional categories. Two or more of the categories are often redundant, in the sense that identical or nearly identical sets of genes map to the categories. The redundancy might typically inflate the report of significant categories roughly three-fold, create an illusion of an overly long list of significant categories, and obscure the relevant biological interpretation.

Results: We now introduce a new resource, RedundancyMiner, that de-replicates the redundant and nearly redundant GO categories that had been determined by first running GoMiner. The main algorithm of RedundancyMiner, MultiClust, performs a novel form of cluster analysis in which a GO category might belong to several category clusters. Each category cluster follows a "complete linkage" paradigm. The metric is a similarity measure that captures the overlap in gene mapping between pairs of categories.

Conclusions: RedundancyMiner effectively eliminated redundancies from a set of GO categories. For illustration, we have applied it to the clarification of the results arising from two current studies: (1) assessment of the gene expression profiles obtained by laser capture microdissection (LCM) of serial cryosections of the retina at the site of final optic fissure closure in mouse embryos at specific embryonic stages, and (2) analysis of a conceptual data set obtained by examining a list of genes deemed to be "kinetochore" genes.
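The clustering idea in the Results paragraph can be illustrated with a small sketch. This is not the MultiClust implementation, and the abstract does not specify the similarity measure or a cutoff. The sketch assumes a Jaccard index as the overlap measure, a hypothetical `threshold` parameter, and made-up category and gene labels. It reads "complete linkage" as "every pair of categories in a cluster meets the threshold", which makes each cluster a maximal clique of the thresholded similarity graph and lets a category appear in several clusters.

```python
from itertools import combinations


def overlap_similarity(genes_a, genes_b):
    """Jaccard index of two gene sets (an assumed stand-in for the paper's metric)."""
    union = genes_a | genes_b
    return len(genes_a & genes_b) / len(union) if union else 0.0


def complete_linkage_clusters(categories, threshold):
    """Return maximal groups (size >= 2) in which every pair meets the threshold.

    `categories` maps a category label to its set of genes. A category may
    appear in more than one returned group.
    """
    names = sorted(categories)
    neighbours = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if overlap_similarity(categories[a], categories[b]) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters = []

    def expand(clique, candidates, excluded):
        # Bron-Kerbosch enumeration of maximal cliques (no pivoting).
        if not candidates and not excluded:
            clusters.append(frozenset(clique))
            return
        for node in sorted(candidates):
            expand(clique | {node},
                   candidates & neighbours[node],
                   excluded & neighbours[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(names), set())
    # Singletons are not redundant with anything, so they are dropped.
    return sorted((c for c in clusters if len(c) >= 2), key=sorted)


# Hypothetical toy input: cat_B overlaps strongly with both cat_A and cat_C,
# while cat_A and cat_C overlap only weakly with each other.
toy_categories = {
    "cat_A": {"g1", "g2", "g3"},
    "cat_B": {"g1", "g2", "g3", "g4"},
    "cat_C": {"g2", "g3", "g4"},
    "cat_D": {"g9"},
}
toy_clusters = complete_linkage_clusters(toy_categories, threshold=0.7)
```

In this toy input, `cat_B` ends up in two clusters because it is similar to both `cat_A` and `cat_C`, while those two fall below the assumed cutoff with each other.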

【 License 】

CC BY   
© Zeeberg et al; licensee BioMed Central Ltd. 2011
