BMC Research Notes
Challenges of the information age: the impact of false discovery on pathway identification
Mary E Edgerton [1], Srinivasa C Chekuri [1], Colin J Rog [1]
[1] M.D. Anderson Cancer Center, Department of Pathology, 1515 Holcombe Blvd, Houston, TX, 77030, USA
Keywords: Cancer pathways; Bioinformatics; Networks; Genes; Pathways; Databases
DOI: 10.1186/1756-0500-5-647
Received 2012-09-11, accepted 2012-11-16, published 2012
【 Abstract 】

Background

Pathways whose members have known relevance to a disease are used to support hypotheses generated from analyses of gene expression and proteomic studies. Using cancer as an example, we explore the pitfalls of searching pathways databases for support when the genes and proteins in question could represent false discoveries.

Findings

The frequency with which networks could be generated was measured for 100 instances each of randomly selected five-gene and ten-gene sets used as input to MetaCore, a commercial pathways database. A PubMed search enumerated the cancer-related literature published for any gene in the networks. Allowing a maximum of three, two, and one intervening steps between input genes to populate the network, networks were generated with frequencies of 97%, 77%, and 7%, respectively, using ten-gene sets and 73%, 27%, and 1%, respectively, using five-gene sets. PubMed reported an average of 4225 cancer-related articles per network gene.
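The sampling procedure described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' method and not MetaCore's algorithm: the pathways database is replaced by a hypothetical random undirected graph, and a network is assumed to count as "generated" when at least two input genes are linked by a path with no more than the allowed number of intervening nodes. The function names (`build_toy_graph`, `within_steps`, `network_generated`, `generation_frequency`) and the graph size are illustrative choices, so the printed frequencies will not reproduce the figures reported here.

```python
import random
from collections import deque


def build_toy_graph(n_genes, n_edges, rng):
    """Hypothetical stand-in for a pathways database: a random undirected graph."""
    graph = {g: set() for g in range(n_genes)}
    while n_edges > 0:
        a, b = rng.sample(range(n_genes), 2)
        if b not in graph[a]:
            graph[a].add(b)
            graph[b].add(a)
            n_edges -= 1
    return graph


def within_steps(graph, source, targets, max_intervening):
    """True if any target is reachable from source with at most
    max_intervening nodes lying between them (breadth-first search)."""
    max_edges = max_intervening + 1
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_edges:
            continue  # do not expand beyond the allowed path length
        for nxt in graph[node]:
            if nxt in seen:
                continue
            if nxt in targets:
                return True
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return False


def network_generated(graph, genes, max_intervening):
    """Assumed criterion: at least one pair of input genes is linked."""
    genes = set(genes)
    return any(
        within_steps(graph, g, genes - {g}, max_intervening) for g in genes
    )


def generation_frequency(graph, set_size, max_intervening, n_trials, rng):
    """Fraction of random gene sets for which a network is generated."""
    universe = list(graph)
    hits = sum(
        network_generated(graph, rng.sample(universe, set_size), max_intervening)
        for _ in range(n_trials)
    )
    return hits / n_trials


rng = random.Random(0)  # fixed seed so the sketch is repeatable
toy = build_toy_graph(n_genes=2000, n_edges=3000, rng=rng)
for set_size in (5, 10):
    for steps in (3, 2, 1):
        freq = generation_frequency(toy, set_size, steps, 100, rng)
        print(f"set size {set_size}, max {steps} intervening: {freq:.0%}")
```

Even in this toy setting, the frequency depends on the size of the input set, the number of intervening steps allowed, and how densely the graph is populated, which is the dependence the Findings quantify for a real database.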

Discussion

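The high frequency with which networks are generated from random gene sets, and the volume of cancer-related literature for the genes in those networks, can be attributed to the richly populated pathways databases and the interest in the molecular basis of cancer. As information sources become enriched, they are more likely to generate plausible mechanisms for false discoveries.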

【 License 】

   
2012 Rog et al.; licensee BioMed Central Ltd.
