Journal Article Details
BMC Bioinformatics
CorrelaGenes: a new tool for the interpretation of the human transcriptome
Software
Lucia Sacchi [1], Sergio Rovida [2], Gianni Sacchi [2], Francesca Calvi [2], Silvia Bione [3], Antonella Lisa [3], Paolo Cremaschi [3], Alessandra Montecucco [3], Giuseppe Biamonti [3]
[1] Dipartimento di Ingegneria Industriale e dell'Informazione, University of Pavia, 27100, Pavia, Italy
[2] Institute of Applied Mathematics and Information Technology "Enrico Magenes", National Research Council, 27100, Pavia, Italy
[3] Institute of Molecular Genetics, National Research Council, 27100, Pavia, Italy
Keywords: Gene Expression Omnibus; Association Rule Mining; Enrichment Score; Public Repository; Functional Annotation Cluster
DOI: 10.1186/1471-2105-15-S1-S6
Source: Springer
【 Abstract 】

Background: The amount of gene expression data available in public repositories has grown exponentially in recent years, creating a need for new data mining tools that transform these data into information easily accessible to biologists.

Results: By exploiting expression data publicly available in the Gene Expression Omnibus (GEO) database, we developed a new bioinformatics tool aimed at identifying genes whose expression appears simultaneously altered in different experimental conditions, thus suggesting co-regulation or coordinated action in the same biological process. To accomplish this task, we used the 978 human GEO Curated DataSets and manually selected 2,109 pair-wise comparisons based on their biological rationale. The lists of differentially expressed genes obtained from the selected comparisons were stored in a PostgreSQL database and used as the data source for the CorrelaGenes tool. Our application uses a customized Association Rule Mining (ARM) algorithm to identify sets of genes showing expression profiles correlated with a gene of interest. The significance of the correlation is measured by coupling the Lift, a well-known standard ARM index, with the χ² p-value. The manually curated selection of the comparisons and the developed algorithm constitute a new approach in the field of gene expression profiling studies. A simulation performed on 100 randomly selected target genes allowed us to evaluate the efficiency of the procedure and to obtain preliminary data demonstrating the consistency of the results.

Conclusions: The preliminary results of the simulation showed how CorrelaGenes could contribute to the characterization of molecular pathways and biological processes, integrating data obtained from other applications and available in public repositories.
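As an illustration of the two indices named in the abstract, the following is a minimal sketch of the textbook Lift and the Pearson χ² test on a 2×2 contingency table, computed for one target gene and one candidate gene across a set of comparisons. It is not the authors' customized ARM algorithm, and the abstract does not say how the two measures are combined. The function name `lift_and_chi2`, the data layout (one set of differentially expressed gene symbols per comparison), and the toy gene lists are all assumptions made for the example.

```python
import math


def lift_and_chi2(comparisons, target, candidate):
    """Return (lift, chi2, p_value) for the rule target -> candidate.

    `comparisons` is a list of sets; each set holds the genes called
    differentially expressed in one pair-wise comparison (assumed layout).
    """
    n = len(comparisons)
    # 2x2 contingency counts over all comparisons
    a = sum(1 for c in comparisons if target in c and candidate in c)
    b = sum(1 for c in comparisons if target in c and candidate not in c)
    c_only = sum(1 for c in comparisons if candidate in c and target not in c)
    d = n - a - b - c_only

    n_target = a + b
    n_candidate = a + c_only
    if 0 in (n_target, n_candidate, n - n_target, n - n_candidate):
        raise ValueError("a gene is altered in none or all comparisons")

    # Lift = P(target and candidate) / (P(target) * P(candidate))
    lift = n * a / (n_target * n_candidate)

    # Pearson chi-square statistic for a 2x2 table (no continuity correction)
    chi2 = n * (a * d - b * c_only) ** 2 / (
        n_target * (n - n_target) * n_candidate * (n - n_candidate)
    )
    # Upper tail of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return lift, chi2, p_value


# Toy data: 8 hypothetical comparisons, not taken from GEO
toy = [
    {"TP53", "CDKN1A"},
    {"TP53", "CDKN1A"},
    {"TP53", "CDKN1A", "MYC"},
    {"TP53"},
    {"MYC"},
    {"MYC"},
    {"GAPDH"},
    set(),
]
lift, chi2, p_value = lift_and_chi2(toy, "TP53", "CDKN1A")
print(lift, chi2, p_value)  # lift is 2.0 for this toy data
```

A Lift above 1 means the two genes are altered together more often than expected if they were independent; the χ² p-value indicates whether that departure from independence could plausibly be due to chance. With counts as small as in this toy example, an exact test would normally be preferred over the χ² approximation.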

【 License 】

© Cremaschi et al.; licensee BioMed Central Ltd. 2014. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
