Journal article details
BMC Bioinformatics
Evaluation of copy number variation detection for a SNP array platform
Hongyan Wang [2], Li Jin [1], Feng Zhang [1], Shilin Li [1], Renqian Du [1], Xin Zhang [1]
[1]State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, 220 Handan Road, Shanghai 200433, China
[2]Children’s Hospital of Fudan University, 399 Wanyuan Road, Shanghai 201102, China
Keywords: PennCNV; GTC; dChip; Birdsuite; Success rate; Reproducibility test; Performance test; Comparison; Evaluation; CGH; CNV
DOI: 10.1186/1471-2105-15-50
Received 2013-07-28; accepted 2014-02-06; published 2014
【 Abstract 】

Background

Copy number variations (CNVs) are usually inferred from single nucleotide polymorphism (SNP) arrays using software packages that implement particular algorithms. However, the relative performance of these packages is not well understood, which makes it difficult to select one or several of them for CNV detection on a SNP array platform.

We selected four publicly available software packages designed for CNV calling from Affymetrix SNP arrays: Birdsuite, dChip, Genotyping Console (GTC) and PennCNV. A publicly available dataset generated by array-based comparative genomic hybridization (CGH), with a resolution of 24 million probes per sample, was taken as the “gold standard”. Against this CGH-based dataset, we assessed the success rate, average stability rate, sensitivity, consistency and reproducibility of the four packages. In particular, we also compared the efficiency of detecting CNVs with two, three or all four packages jointly against that of a single package.
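To make the comparison against the “gold standard” concrete, the following is a minimal Python sketch of how sensitivity could be computed from two sets of CNV intervals. The matching rule used here (at least 50% reciprocal overlap) and the names `CNV`, `reciprocal_overlap` and `sensitivity` are assumptions for illustration only; the abstract does not state which matching criterion was used.

```python
from typing import List, NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int  # first base of the segment (inclusive)
    end: int    # last base of the segment (inclusive)


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    len_a = a.end - a.start + 1
    len_b = b.end - b.start + 1
    return min(shared / len_a, shared / len_b)


def sensitivity(calls: List[CNV], gold: List[CNV], min_overlap: float = 0.5) -> float:
    """Fraction of gold-standard CNVs matched by at least one call.

    min_overlap is a hypothetical threshold, not a value from the paper.
    """
    if not gold:
        return 0.0
    hits = sum(
        1
        for g in gold
        if any(reciprocal_overlap(c, g) >= min_overlap for c in calls)
    )
    return hits / len(gold)


# Toy coordinates, invented purely to exercise the functions.
gold_set = [CNV("chr1", 100, 199), CNV("chr1", 1000, 1999), CNV("chr2", 50, 149)]
call_set = [CNV("chr1", 120, 219), CNV("chr2", 500, 600)]
print(sensitivity(call_set, gold_set))
```

A stricter or looser threshold changes which calls count as matches, so any reported sensitivity is only meaningful together with the overlap rule that produced it.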

Results

In terms of the sheer number of CNVs detected, Birdsuite detected the most and GTC the fewest. Birdsuite and dChip showed obvious detection bias, and GTC appeared inferior because it detected the fewest CNVs. We then investigated the consistency between the calls of each software package and those of the other three. The consistency of dChip was the lowest and that of GTC the highest. When compared with the CGH results, GTC called the most CNVs in the matching group, with PennCNV-Affy ranking second; in the non-overlapping group, GTC called the fewest CNVs. With regard to the reproducibility of CNV calling, larger CNVs were usually replicated better. PennCNV-Affy showed the best consistency and Birdsuite the poorest.

Conclusion

We found that PennCNV outperformed the other three packages in the sensitivity and specificity of CNV calling. Each calling method has its own limitations and advantages for different data analyses. Optimized calling strategies might therefore be identified by using multiple algorithms to evaluate the concordance and discordance of SNP array-based CNV calls.
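One simple way to evaluate concordance across several callers is to count, for each call, how many packages report an overlapping segment. The Python sketch below assumes that any positional overlap counts as agreement and uses hypothetical package labels and toy coordinates; it illustrates the idea and is not the procedure used in the study.

```python
from typing import Dict, List, Tuple

Call = Tuple[str, int, int]  # (chromosome, start, end), both ends inclusive


def overlaps(a: Call, b: Call) -> bool:
    """True when two calls lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def support_counts(calls_by_package: Dict[str, List[Call]]) -> Dict[Tuple[str, Call], int]:
    """For each (package, call), count packages with an overlapping call.

    The count includes the package that made the call, so it ranges
    from 1 (unique to one package) to the number of packages.
    """
    support = {}
    for package, calls in calls_by_package.items():
        for call in calls:
            support[(package, call)] = sum(
                1
                for other_calls in calls_by_package.values()
                if any(overlaps(call, other) for other in other_calls)
            )
    return support


# Hypothetical package labels and invented coordinates.
example = {
    "pkg_a": [("chr1", 100, 200)],
    "pkg_b": [("chr1", 150, 250), ("chr2", 1, 10)],
    "pkg_c": [("chr1", 180, 190)],
}
for key, n in support_counts(example).items():
    print(key, n)
```

Calls supported by more packages can then be compared with single-package calls, for example by checking each group separately against the CGH reference.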

【 License 】

   
2014 Zhang et al.; licensee BioMed Central Ltd.
