BMC Bioinformatics
Iterative rank-order normalization of gene expression microarray data
Eric A Welsh [1], Steven A Eschrich [1], Anders E Berglund [1], David A Fenstermacher [1]
[1] H. Lee Moffitt Cancer Center and Research Institute, University of South Florida, Tampa, FL, 33612, USA
Keywords: GeneChip; Affymetrix; Normalization; Expression; Microarray
Others: 1087882
DOI: 10.1186/1471-2105-14-153
Received 2012-11-28, accepted 2013-04-29, published 2013
【 Abstract 】
Background
Many gene expression normalization algorithms exist for Affymetrix GeneChip microarrays. The most popular of these is RMA, primarily due to the precision and low noise produced during the process. A significant strength of this and similar approaches is the use of the entire set of arrays during both normalization and model-based estimation of signal. However, this leads to differing estimates of expression based on the starting set of arrays, and estimates can change when a single, additional chip is added to the set. Additionally, outlier chips can impact the signals of other arrays, and can themselves be skewed by the majority of the population.
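The dependence on the starting set of arrays can be illustrated with quantile normalization, the normalization step used in RMA. The following is a minimal sketch in plain Python, not the RMA implementation: the function name `quantile_normalize` and the toy intensities are assumptions made for illustration, and ties are broken by position for simplicity.

```python
from statistics import mean

def quantile_normalize(chips):
    """Quantile-normalize equal-length chips (lists of probe intensities).

    Each value is replaced by the mean, across all chips, of the values
    that share its rank. Illustrative helper, not taken from the paper.
    """
    n_probes = len(chips[0])
    sorted_chips = [sorted(chip) for chip in chips]
    # Shared target distribution: mean of the k-th smallest value over all chips.
    target = [mean(s[k] for s in sorted_chips) for k in range(n_probes)]
    normalized = []
    for chip in chips:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: chip[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = target[rank]
        normalized.append(out)
    return normalized

# Toy data (hypothetical intensities).
chip_a = [10.0, 20.0, 30.0, 40.0]
chip_b = [12.0, 22.0, 28.0, 44.0]
outlier = [100.0, 300.0, 200.0, 400.0]

before = quantile_normalize([chip_a, chip_b])
after = quantile_normalize([chip_a, chip_b, outlier])

# chip_a's normalized values shift once the outlier chip joins the set.
print(before[0])
print(after[0])
```

Because the target distribution is averaged over every chip, adding one bright chip pulls the normalized values of all the others upward, which is the behavior the Background describes.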
Results
We developed an approach, termed IRON, which uses the best-performing techniques from each of several popular processing methods while retaining the ability to incrementally renormalize data without altering previously normalized expression. This combination of approaches results in a method that performs comparably to existing approaches on artificial benchmark datasets (i.e. spike-in) and demonstrates promising improvements in segregating true signals within biologically complex experiments.
Conclusions
By combining approaches from existing normalization techniques, the IRON method offers several advantages. First, IRON normalization occurs pair-wise, thereby avoiding the need for all chips to be normalized together, which can be important for large data analyses. Second, the technique does not require similarity in signal distribution across chips for normalization, which can be important for maintaining biologically relevant differences in a heterogeneous background. Third, IRON introduces fewer post-processing artifacts, particularly in data whose behavior violates common assumptions. Thus, the IRON method provides a practical solution to common needs of expression analysis. A software implementation of IRON is available at http://gene.moffitt.org/libaffy/.
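The pair-wise property can be sketched as follows. This example does not reproduce the IRON procedure, which the abstract does not specify; it substitutes a simple median log-ratio scaling against a fixed reference chip as a hypothetical stand-in, purely to show why pair-wise normalization leaves previously normalized chips untouched. The function name `normalize_to_reference` and all intensities are assumptions.

```python
import math
from statistics import median

def normalize_to_reference(chip, reference):
    """Scale one chip so its median log2 ratio to the reference is zero.

    Median scaling is a hypothetical stand-in here, not IRON itself.
    Only the chip and the reference are consulted, never other chips.
    """
    log_ratios = [math.log2(c / r) for c, r in zip(chip, reference)]
    shift = median(log_ratios)
    return [c / (2.0 ** shift) for c in chip]

# Toy data (hypothetical intensities).
reference = [100.0, 200.0, 400.0, 800.0]
chip_a = [200.0, 400.0, 800.0, 1600.0]   # uniformly 2x brighter
chip_b = [50.0, 100.0, 200.0, 3200.0]    # 0.5x overall, one probe truly changed

norm_a = normalize_to_reference(chip_a, reference)
norm_b = normalize_to_reference(chip_b, reference)

# A chip added later is normalized on its own against the same reference.
chip_c = [300.0, 600.0, 1200.0, 2400.0]
norm_c = normalize_to_reference(chip_c, reference)

# Re-running chip_a after chip_c arrives gives the identical result.
norm_a_again = normalize_to_reference(chip_a, reference)

print(norm_a)
print(norm_b)
```

In this toy case the changed probe on `chip_b` keeps its elevated value after scaling, since the chip is not forced onto the reference's full distribution.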
【 License 】
2013 Welsh et al.; licensee BioMed Central Ltd.