Journal article details
BMC Bioinformatics
Confident difference criterion: a new Bayesian differentially expressed gene selection algorithm with applications
Fang Yu [4], Ming-Hui Chen [2], Lynn Kuo [2], Heather Talbott [1], John S. Davis [3]
[1] Department of Biochemistry and Molecular Biology and Department of Obstetrics and Gynecology, University of Nebraska Medical Center, Omaha 68198-5870, NE, USA
[2] Department of Statistics, University of Connecticut, Storrs 06269-4120, CT, USA
[3] VA Nebraska-Western Iowa Health Care System and Department of Obstetrics and Gynecology, University of Nebraska Medical Center, Omaha 68198-3255, NE, USA
[4] Department of Biostatistics, University of Nebraska Medical Center, Omaha 68198-4350, NE, USA
Keywords: Next-generation sequencing; Microarray; Differential expression; Bayesian
DOI  :  10.1186/s12859-015-0664-3
Received 2014-10-24; accepted 2015-07-07; published 2015
【 Abstract 】

Background

Bayesian methods have recently become more popular for analyzing high-dimensional gene expression data because they allow information to be borrowed across genes and provide powerful estimators of gene expression levels. It is therefore crucial to develop a simple but efficient gene selection algorithm that detects differentially expressed (DE) genes based on these Bayesian estimators.

Results

In this paper, by extending the two-criterion idea of Chen et al. (Chen M-H, Ibrahim JG, Chi Y-Y. A new class of mixture models for differential gene expression in DNA microarray data. J Stat Plan Inference. 2008;138:387–404), we propose two new gene selection algorithms for general Bayesian models, which we call the confident difference criterion methods. The first is based on the standardized difference between the two mean expression values of each gene; the second additionally uses the difference between the two variances. Both methods first evaluate the posterior probability that a gene's expression differs between the samples being compared and then declare the gene to be DE if that posterior probability is large. We establish the theoretical connection between the first, mean-based method and the Bayes factor approach of Yu et al. (Yu F, Chen M-H, Kuo L. Detecting differentially expressed genes using calibrated Bayes factors. Statistica Sinica. 2008;18:783–802) under the normal-normal model with equal variances in the two samples. The empirical performance of the proposed methods is examined and compared with that of several existing methods in simulation studies. The results show that the proposed confident difference criterion methods outperform the existing methods when comparing gene expression across different conditions, for both microarray studies and sequence-based high-throughput studies. A real dataset is used to further demonstrate the proposed methodology; in this application, the confident difference criterion methods identified more clinically important DE genes than the other methods.
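
The mean-based selection rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that posterior draws of the two group means and of a common standard deviation are already available for each gene (for example from an MCMC fit), and that the standardized difference is the absolute mean difference divided by that standard deviation. The function names, the threshold `delta`, and the probability `cutoff` are hypothetical and chosen only for illustration.

```python
import random


def confident_difference_probability(mu1_draws, mu2_draws, sigma_draws, delta):
    """Monte Carlo estimate of P(|mu1 - mu2| / sigma > delta | data).

    Each argument is a sequence of posterior draws for one gene.
    `delta` is a hypothetical threshold on the standardized difference.
    """
    n = len(mu1_draws)
    if n == 0 or len(mu2_draws) != n or len(sigma_draws) != n:
        raise ValueError("draw sequences must be non-empty and of equal length")
    hits = sum(
        1
        for m1, m2, s in zip(mu1_draws, mu2_draws, sigma_draws)
        if abs(m1 - m2) / s > delta
    )
    return hits / n


def select_de_genes(posterior_draws, delta=1.0, cutoff=0.9):
    """Return genes whose estimated posterior probability reaches `cutoff`.

    `posterior_draws` maps a gene name to a tuple
    (mu1_draws, mu2_draws, sigma_draws). Both `delta` and `cutoff`
    are illustrative defaults, not values taken from the paper.
    """
    return [
        gene
        for gene, (m1, m2, s) in posterior_draws.items()
        if confident_difference_probability(m1, m2, s, delta) >= cutoff
    ]


if __name__ == "__main__":
    # Simulated stand-ins for posterior draws (assumption: roughly normal posteriors).
    rng = random.Random(0)
    draws = {
        "gene_shifted": (
            [rng.gauss(0.0, 0.1) for _ in range(2000)],
            [rng.gauss(2.0, 0.1) for _ in range(2000)],
            [abs(rng.gauss(1.0, 0.05)) + 1e-9 for _ in range(2000)],
        ),
        "gene_null": (
            [rng.gauss(0.0, 0.1) for _ in range(2000)],
            [rng.gauss(0.0, 0.1) for _ in range(2000)],
            [abs(rng.gauss(1.0, 0.05)) + 1e-9 for _ in range(2000)],
        ),
    }
    print(select_de_genes(draws, delta=1.0, cutoff=0.9))
```

The second proposed method would add a comparable condition on the difference between the two variances; it is left out here because the abstract does not specify its form.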

Conclusion

The confident difference criterion method proposed in this paper provides a new efficient approach for both microarray studies and sequence-based high-throughput studies to identify differentially expressed genes.

【 License 】

   
2015 Yu et al.; licensee BioMed Central.

【 References 】
  • [1] Atli MO, Bender RW, Mehta V, Bastos MR, Luo W, Vezina CM et al. Patterns of gene expression in the bovine corpus luteum following repeated intrauterine infusions of low doses of prostaglandin F2α. Biol Reprod. 2012; 86(4):130.
  • [2] Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010; 11:R106.
  • [3] Auer PL, Doerge RW. A two-stage Poisson model for testing RNA-Seq data. Stat Appl Genet Mol Biol. 2011; 10:1-26.
  • [4] Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008; 456(7218):53-9.
  • [5] Bishop CV, Bogan RL, Hennebold JD, Stouffer RL. Analysis of microarray data from the macaque corpus luteum; the search for common themes in primate luteal regression. Mol Hum Reprod. 2011; 17(3):143-51.
  • [6] Chen M-H, Ibrahim JG, Chi Y-Y. A new class of mixture models for differential gene expression in DNA microarray data. J Stat Plan Inference. 2008; 138:387-404.
  • [7] Dudoit S, Yang YH, Callow MJ, Speed TP. Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica Sinica. 2002; 12:111-39.
  • [8] Di Y, Schafer DW, Cumbie JS, Chang JH. The NBP negative binomial model for assessing differential gene expression from RNA-Seq. Stat Appl Genet Mol Biol. 2011; 10(1):1-28.
  • [9] Galvão AM, Ferreira-Dias G, Skarzynski DJ. Cytokines and angiogenesis in the corpus luteum. Mediators Inflamm. 2013; 2013:420186.
  • [10] Hardcastle TJ, Kelly KA. baySeq: Empirical Bayesian analysis of patterns of differential expression in count data. BMC Bioinformatics. 2010; 11:422-35.
  • [11] Hou X, Arvisais EW, Jiang C, Chen DB, Roy SK, Pate JL et al. Prostaglandin F2α stimulates the expression and secretion of transforming growth factor B1 via induction of the early growth response 1 gene (EGR1) in the bovine corpus luteum. Mol Endocrinol. 2008; 22(2):403-414.
  • [12] Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003; 4(2):249-64.
  • [13] Ibrahim JG, Chen M-H, Gray RJ. Bayesian models for gene expression with DNA microarray data. J Am Stat Assoc. 2002; 97:88-99.
  • [14] Kendziorski CM, Newton MA, Lan H, Gould MN. On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles. Stat Med. 2003; 22:3899-914.
  • [15] Kuo L, Yu F, Zhao Y. Statistical methods for identifying differentially expressed genes in replicated experiments: A review. Statistical Advances in the Biomedical Sciences: Clinical Trials, Epidemiology, Survival Analysis, and Bioinformatics. Biswas A, Datta S, Fine J, Segal M, editors. Wiley-Interscience, Hoboken, NJ; 2008.
  • [16] Kvam VM, Liu P, Si Y. A comparison of statistical methods for detecting differentially expressed genes from RNA-seq data. Am J Bot. 2012; 99(2):248-56.
  • [17] Leng N, Dawson JA, Stewart RM, Ruotti V, Rissman A, Smits B et al. EBSeq: An empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics. 2013; 29(8):1035-43.
  • [18] Ley TJ, Mardis ER, Ding L, Fulton B, McLellan MD, Chen K et al. DNA sequencing of a cytogenetically normal acute myeloid leukaemia genome. Nature. 2008; 456(7218):66-72.
  • [19] Li J, Tibshirani R. Finding consistent patterns: a nonparametric approach for identifying differential expression in RNA-seq data. Stat Methods Med Res. 2013; 22:519-36.
  • [20] Lu J, Tomfohr JK, Kepler TB. Identifying differential expression in multiple SAGE libraries: an overdispersed log-linear model approach. BMC Bioinformatics. 2005; 6:165.
  • [21] Maroni D, Davis JS. TGFB1 disrupts the angiogenic potential of microvascular endothelial cells of the corpus luteum. J Cell Sci. 2012; 124(14):2501-510.
  • [22] Mondal M, Schilling B, Folger J, Steibel JP, Buchnick H, Zalman Y et al. Deciphering the luteal transcriptome: potential mechanisms mediating stage-specific luteolytic response of the corpus luteum to prostaglandin F2α. Physiol Genomics. 2011; 43(8):447-56.
  • [23] Newton MA, Noueiry A, Sarkar D, Ahlquist P. Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics. 2004; 5:155-76.
  • [24] Okuda K, Sakumoto R. Multiple roles of TNF super family members in corpus luteum function. Reprod Biol Endocrinol. 2003; 1:95.
  • [25] Pan W. A comparative review of statistical methods for discovering differentially expressed genes in replicated microarray experiments. Bioinformatics. 2002; 18:546-54.
  • [26] Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010; 26:139-40.
  • [27] Romero JJ, Antoniazzi AQ, Smirnova NP, Webb BT, Yu F, Davis JS et al. Pregnancy-associated genes contribute to antiluteolytic mechanisms in ovine corpus luteum. Physiol Genomics. 2013; 45(22):1095-1108.
  • [28] Smyth GK. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004; 3(1):Article 3.
  • [29] Soneson C, Delorenzi M. A comparison of methods for differential expression analysis of RNA-Seq data. BMC Bioinformatics. 2013; 14:91.
  • [30] Storey JD. A direct approach to false discovery rates. J R Stat Soc Ser B. 2002; 64:479-98.
  • [31] Tadesse MG, Ibrahim JG, Vannucci M, Gentleman R. Wavelet thresholding with Bayesian false discovery rate control. Biometrics. 2005; 61:25-35.
  • [32] Tarazona S, García-Alcalde F, Dopazo J, Ferrer A, Conesa A. Differential expression in RNA-Seq: a matter of depth. Genome Res. 2011; 21:2213-223.
  • [33] Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A. 2001; 98:5116-121.
  • [34] Wang J, Wang W, Li R, Li Y, Tian G, Goodman L et al. The diploid genome sequence of an Asian individual. Nature. 2008; 456:60-65.
  • [35] Wilson EB, Hilferty MM. The distribution of chi-square. Proc Natl Acad Sci U S A. 1931; 17:684-88.
  • [36] Yu F, Chen M-H, Kuo L. Detecting differentially expressed genes using calibrated Bayes factors. Statistica Sinica. 2008; 18:783-802.
  • [37] Zalman Y, Klipper E, Farberov S, Mondal M, Wee G, Folger JK. Regulation of Angiogenesis-Related Prostaglandin F2alpha-Induced Genes in the Bovine Corpus Luteum. Biology of Reproduction. 2012; 86(3):92.