Journal article details
BMC Research Notes
Comparison of alternative mixture model methods to analyze bacterial CGH experiments with multi-genome arrays
Francisco Rodrigues Pinto [3], Marília Antunes [2], Mário Ramirez [1], Cláudia Elvas Suissas [3], Liliana Sofia Cardoso [1]
[1] Instituto de Microbiologia, Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
[2] Centro de Estatística e Aplicações, DEIO, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
[3] Centro de Química e Bioquímica, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal
Keywords: Data analysis; Comparative genomic hybridization; Microarray
Others  :  1134278
DOI  :  10.1186/1756-0500-7-148
Received 2013-03-22, accepted 2014-03-10, published 2014
【 Abstract 】

Background

Microarray-based comparative genomic hybridization (aCGH) is used for rapid comparison of the genomes of different bacterial strains. The purpose is to evaluate the distribution of genes from sequenced bacterial strains (control) among unsequenced strains (test). We previously compared single-strain and multiple-strain controls on arrays covering multiple genomes, and concluded that a multiple-strain control gave a better separation of signals between present and absent genes.

Findings

We now extend our previous study in two ways. First, we apply the Expectation-Maximization (EM) algorithm to fit a mixture model to the signal distribution in order to classify each gene as present or absent. Second, we compare different methods for analyzing aCGH data, using combinations of different control strain choices, two different statistical mixture models, with or without normalization, with or without logarithmic transformation, and with test-over-control or control-over-test signal ratio calculation. We also assessed the impact of replication on classification accuracy. Higher accuracy was achieved using the ratio of control-over-test intensities, without logarithmic transformation and with a strain mix control. Neither normalization nor the type of mixture model fitted by the EM algorithm had a significant impact on classification accuracy. Similarly, using the average of replicate arrays to perform the classification did not significantly improve the results.
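
The classification step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which two mixture models were compared, so a two-component normal mixture is assumed here, and the function names, the 0.5 posterior threshold, and the synthetic ratios are all hypothetical. The sketch assumes that, with control-over-test ratios, the component with the lower mean corresponds to genes present in the test strain.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of a normal distribution evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def fit_two_component_em(x, n_iter=200, tol=1e-8):
    """Fit a two-component normal mixture to 1-D signals with EM.

    Assumption: both components are normal. Returns the mixing weights,
    means, standard deviations, and the responsibilities computed in
    the final E-step.
    """
    x = np.asarray(x, dtype=float)
    # Initialise the two means from the lower and upper quartiles.
    means = np.percentile(x, [25, 75]).astype(float)
    sds = np.full(2, x.std() + 1e-6)
    weights = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each gene.
        dens = np.stack(
            [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
        )
        total = dens.sum(axis=0) + 1e-300  # guard against division by zero
        resp = dens / total
        # M-step: re-estimate the parameters from the responsibilities.
        nk = resp.sum(axis=1)
        weights = nk / x.size
        means = (resp * x).sum(axis=1) / nk
        sds = np.sqrt((resp * (x - means[:, None]) ** 2).sum(axis=1) / nk) + 1e-6
        # Stop when the log-likelihood no longer changes.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, sds, resp


def classify_present(ratios):
    """Call a gene present when the low-ratio component is more probable."""
    _, means, _, resp = fit_two_component_em(ratios)
    present_component = int(np.argmin(means))
    return resp[present_component] > 0.5


# Synthetic control-over-test ratios (illustrative values, not data from the study).
rng = np.random.default_rng(0)
ratios = np.concatenate([
    rng.normal(1.0, 0.2, 300),  # genes present in the test strain
    rng.normal(6.0, 1.0, 100),  # genes absent from the test strain
])
truth = np.concatenate([np.ones(300, dtype=bool), np.zeros(100, dtype=bool)])
calls = classify_present(ratios)
accuracy = float((calls == truth).mean())
print(f"classification accuracy on synthetic data: {accuracy:.3f}")
```

To try the other options compared in the study, the same function could be applied to test-over-control ratios or to log-transformed values, with the present component chosen accordingly.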

Conclusions

Our work provides a benchmark comparison of alternative methods for analyzing aCGH results, which can guide the analysis of ongoing comparative genomics projects and the re-analysis of published studies.

【 License 】

2014 Cardoso et al.; licensee BioMed Central Ltd.
