Journal Article Details
BMC Bioinformatics
Simple binary segmentation frameworks for identifying variation in DNA copy number
Tae Young Yang [1]
[1] Department of Mathematics, Myongji University, Yongin, Kyonggi, 449-728, Korea
Keywords: Variation in DNA copy number; Overall molecular signature; Consensus molecular signature; Circular binary segmentation; Bayesian information criterion
DOI: 10.1186/1471-2105-13-277
Received 2012-04-30, accepted 2012-10-22, published 2012
【 Abstract 】

Background

Variation in DNA copy number, due to gains and losses of chromosome segments, is common. A first step in analyzing DNA copy number data is to identify amplified or deleted regions in individuals. To locate such regions, we propose a circular binary segmentation procedure based on a sequence of nested hypothesis tests, each of which uses the Bayesian information criterion.
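The idea of a BIC-driven circular binary segmentation can be illustrated with a minimal sketch. This is not the paper's exact formulation: the Gaussian likelihood with a common variance, the parameter counts (2 for the no-change model, 5 for the arc model), the `min_size` guard, and the function names `best_arc` and `segment` are all assumptions made for illustration. At each step the segment is treated as a circle, the best arc whose mean differs from the rest is found, and the split is accepted only if it lowers the BIC.

```python
import math
from itertools import accumulate


def _bic(rss, n, k):
    # Gaussian BIC up to an additive constant (assumed noise model).
    return n * math.log(max(rss, 1e-12) / n) + k * math.log(n)


def best_arc(x, min_size=3):
    """Return the arc (i, j) that beats the no-change model by BIC, or None."""
    n = len(x)
    s = [0.0] + list(accumulate(x))                   # prefix sums
    q = [0.0] + list(accumulate(v * v for v in x))    # prefix sums of squares
    total_s, total_q = s[n], q[n]
    best_bic = _bic(total_q - total_s ** 2 / n, n, 2)  # null: one mean
    best = None
    for i in range(0, n - min_size + 1):
        for j in range(i + min_size, n + 1):
            m = j - i                                  # points inside the arc
            if n - m < min_size:
                continue
            in_s, in_q = s[j] - s[i], q[j] - q[i]
            out_s, out_q = total_s - in_s, total_q - in_q
            rss = (in_q - in_s ** 2 / m) + (out_q - out_s ** 2 / (n - m))
            bic = _bic(rss, n, 5)                      # two means, variance, two ends
            if bic < best_bic:
                best_bic, best = bic, (i, j)
    return best


def segment(x, min_size=3):
    """Recursively split x; return the sorted list of change-point indices."""
    cuts, stack = [], [(0, len(x))]
    while stack:
        lo, hi = stack.pop()
        if hi - lo < 2 * min_size:
            continue
        arc = best_arc(x[lo:hi], min_size)
        if arc is None:
            continue
        bounds = sorted({lo, lo + arc[0], lo + arc[1], hi})
        cuts.extend(b for b in bounds if lo < b < hi)
        stack.extend(zip(bounds, bounds[1:]))          # test each sub-segment
    return sorted(cuts)


# Toy log-ratio profile: a gained region spanning probes 20..29.
profile = [0.05, -0.05] * 10 + [1.05, 0.95] * 5 + [0.05, -0.05] * 10
print(segment(profile))
```

The exhaustive arc search is quadratic in the segment length, which is fine for a toy profile but would need a faster search on real array data.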

Results

Our procedure is convenient for analyzing DNA copy number in two general situations: (1) when using data from multiple sources and (2) when using cohort analysis of multiple patients suffering from the same type of cancer. In the first case, data from multiple sources such as different platforms, labs, or preprocessing methods are used to study variation in copy number in the same individual. Combining these sources provides a higher resolution, which leads to a more detailed genome-wide survey of the individual. In this case, we provide a simple statistical framework to derive a consensus molecular signature. In the framework, the multiple sequences from various sources are integrated into a single sequence, and then the proposed segmentation procedure is applied to this sequence to detect aberrant regions. In the second case, cohort analysis of multiple patients is carried out to derive overall molecular signatures for the cohort. For this case, we provide another simple statistical framework in which data across multiple profiles are standardized before segmentation. The proposed segmentation procedure is then applied to the standardized profiles one at a time to detect aberrant regions. Any such regions that are common across two or more profiles are probably real and may play important roles in cancer pathogenesis.
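The two frameworks can be sketched in a few helper functions. The abstract does not say how sources are integrated or how profiles are standardized, so everything here is an assumption for illustration: `merge_sources` simply interleaves (position, log-ratio) pairs by genomic position, `standardize` applies a per-profile z-score, and `recurrent_regions` reports probe ranges flagged as aberrant in at least `min_profiles` profiles. All three names are hypothetical.

```python
from statistics import mean, pstdev


def merge_sources(sources):
    """Case 1 (assumed scheme): pool (position, value) pairs from all
    sources into one sequence ordered by genomic position."""
    return sorted(point for source in sources for point in source)


def standardize(profile):
    """Case 2 (assumed scheme): z-score one profile so that profiles
    from different patients are on a comparable scale."""
    mu, sd = mean(profile), pstdev(profile)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in profile]


def recurrent_regions(regions_per_profile, n_probes, min_profiles=2):
    """Return half-open probe ranges covered by aberrant regions in at
    least `min_profiles` profiles."""
    coverage = [0] * n_probes
    for regions in regions_per_profile:
        covered = set()                      # count each profile once per probe
        for start, end in regions:
            covered.update(range(start, end))
        for idx in covered:
            coverage[idx] += 1
    out, start = [], None
    for idx, count in enumerate(coverage + [0]):   # sentinel closes a final run
        if count >= min_profiles and start is None:
            start = idx
        elif count < min_profiles and start is not None:
            out.append((start, idx))
            start = None
    return out


# Two hypothetical platforms measuring the same individual.
merged = merge_sources([[(100, 0.1), (300, 0.9)], [(200, 0.2)]])

# Aberrant regions (probe index ranges) detected in three hypothetical patients.
shared = recurrent_regions([[(20, 30)], [(25, 40)], [(5, 8)]], n_probes=50)
print(merged, shared)
```

In this sketch the merged sequence from the first case, or each standardized profile from the second, would then be passed to whatever segmentation routine is in use.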

Conclusions

The main advantages of the proposed procedure are flexibility and simplicity.

【 License 】

   
2012 Yang; licensee BioMed Central Ltd.
