BMC Bioinformatics
A two-step hierarchical hypothesis set testing framework, with applications to gene expression data on ordered categories
Yihan Li [1], Debashis Ghosh [1]
[1] Department of Statistics, Pennsylvania State University, University Park, State College, Pennsylvania 16802, USA
Keywords: Dose response; Time course; Microarray; Benjamini-Hochberg procedure; Mixed-directional false discovery rate; Overall false discovery rate; Multiple testing
DOI  :  10.1186/1471-2105-15-108
Received 2013-07-25; accepted 2014-04-09; published 2014
【 Abstract 】

Background

In complex large-scale experiments, a large number of features are considered simultaneously, and multiple hypotheses are often tested for each feature. This leads to a problem of multi-dimensional multiple testing. For example, in gene expression studies over ordered categories (such as time-course or dose-response experiments), interest often lies in testing differential expression across several categories for each gene. In this paper, we consider a framework for testing multiple sets of hypotheses, which can be applied to a wide range of problems.

Results

We adopt the concept of the overall false discovery rate (OFDR) for controlling false discoveries at the hypothesis set level. Based on an existing procedure for identifying differentially expressed gene sets, we discuss a general two-step hierarchical hypothesis set testing procedure, which controls the OFDR under independence across hypothesis sets. In addition, we discuss the concept of the mixed-directional false discovery rate (mdFDR), and extend the general procedure to enable directional decisions for two-sided alternatives. We apply the framework to microarray time-course/dose-response experiments, and propose three procedures for testing differential expression and making multiple directional decisions for each gene. Simulation studies confirm that the proposed procedures control the OFDR and mdFDR under independence and under positive correlations across genes. Simulation results also show that two of our new procedures achieve higher power than previous methods. Finally, the proposed methodology is applied to a microarray dose-response study to identify 17β-estradiol-sensitive genes in breast cancer cells that are induced at low concentrations.
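
The two-step idea described above can be illustrated with a minimal Python sketch. It is not the authors' exact procedure: the abstract does not specify the screening or within-set tests, so this sketch assumes a Simes combination p-value per hypothesis set for screening, the Benjamini-Hochberg procedure across sets, and a Bonferroni test within each selected set at the reduced level R·α/m, where R is the number of selected sets and m the total number of sets. The function names `simes_pvalue`, `bh_reject`, and `two_step_test` are hypothetical.

```python
import numpy as np

def simes_pvalue(pvals):
    """Simes combination p-value for one hypothesis set (assumed screening test)."""
    p = np.sort(np.asarray(pvals, dtype=float))
    n = p.size
    return float(np.min(n * p / np.arange(1, n + 1)))

def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject

def two_step_test(pval_sets, alpha=0.05):
    """Step 1: screen sets with BH on Simes p-values.
    Step 2: within each selected set, test individual hypotheses by
    Bonferroni at level R * alpha / m (an assumed within-set test)."""
    m = len(pval_sets)
    screening = [simes_pvalue(s) for s in pval_sets]
    selected = bh_reject(screening, alpha)
    R = int(selected.sum())
    level = R * alpha / m
    results = []
    for s, is_selected in zip(pval_sets, selected):
        p = np.asarray(s, dtype=float)
        if is_selected:
            results.append(p <= level / p.size)
        else:
            results.append(np.zeros(p.size, dtype=bool))
    return selected, results

# Toy input: three genes, each with three per-category p-values (made-up numbers).
pval_sets = [[0.001, 0.2, 0.8], [0.5, 0.6, 0.9], [0.0001, 0.0002, 0.3]]
selected, results = two_step_test(pval_sets, alpha=0.05)
```

Here `selected` flags the genes that pass the screening step, and `results` holds, for each gene, which individual hypotheses are rejected in the second step. Testing within sets at the reduced level R·α/m rather than α is what ties the second step to the number of sets that survived screening.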

Conclusions

The framework we discuss provides a platform for multiple testing procedures covering situations involving two (or potentially more) sources of multiplicity. The framework is easy to use and adaptable to various practical settings that frequently occur in large-scale experiments. Procedures generated from the framework are shown to maintain control of the OFDR and mdFDR, quantities that are especially relevant in the case of multiple hypothesis set testing. The procedures work well in both simulations and real datasets, and are shown to have better power than existing methods.

【 License 】

   
2014 Li and Ghosh; licensee BioMed Central Ltd.

