Journal article details
BMC Genomics
Are special read alignment strategies necessary and cost-effective when handling sequencing reads from patient-derived tumor xenografts?
Kevin Y Yip [1]; Kwok-Wai Lo [2]; Sau Dan Lee [3]; Kai-Yuen Tso [3]
[1] CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong;Department of Anatomical and Cellular Pathology, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong;Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
Keywords: High-throughput sequencing; Contamination; Nasopharyngeal carcinoma; Xenografts
Others  :  1121505
DOI  :  10.1186/1471-2164-15-1172
Received 2014-04-08, accepted 2014-12-11, published 2014
【 Abstract 】

Background

Patient-derived tumor xenografts in mice are widely used in cancer research and have become important in developing personalized therapies. When these xenografts are subjected to DNA sequencing, the samples can contain varying amounts of mouse DNA. It has been unclear how the mouse reads affect data analyses. We conducted comprehensive simulations to compare three alignment strategies at different mutation rates, read lengths, sequencing error rates, human-mouse mixing ratios and sequenced regions. We also sequenced a nasopharyngeal carcinoma xenograft and a cell line to test how the strategies work on real data.

Results

We found that the "filtering" and "combined reference" strategies performed better than aligning reads directly to the human reference in terms of alignment and variant calling accuracy. The combined reference strategy was particularly good at reducing false-negative variant calls without significantly increasing the false-positive rate. In some scenarios the performance gain of these two special handling strategies was too small for special handling to be cost-effective, but special handling proved crucial when false non-synonymous SNVs must be minimized, especially in exome sequencing.
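The abstract names the three strategies without defining them, so the following is only a toy sketch of one plausible reading. It assumes that "direct" keeps every read that aligns to the human reference, that "filtering" first discards any read that aligns to the mouse reference, and that "combined reference" keeps a read only when its best hit in a joint human-plus-mouse reference is human. The `Read` type, the score fields, the tie handling and the example reads are all hypothetical; no real aligner is invoked.

```python
from typing import List, NamedTuple, Optional


class Read(NamedTuple):
    """Hypothetical per-read summary: best alignment score against each
    reference, or None when the read does not align to that reference."""
    name: str
    human_score: Optional[int]
    mouse_score: Optional[int]


def direct_to_human(reads: List[Read]) -> List[str]:
    # Keep every read that aligns to the human reference,
    # regardless of how well it matches mouse.
    return [r.name for r in reads if r.human_score is not None]


def filtering(reads: List[Read]) -> List[str]:
    # Assumed behaviour: discard reads that align to mouse at all,
    # then keep the remaining reads that align to human.
    return [
        r.name
        for r in reads
        if r.mouse_score is None and r.human_score is not None
    ]


def combined_reference(reads: List[Read]) -> List[str]:
    # Assumed behaviour: against a joint reference, a read is kept only
    # when its best hit is human. Ties are dropped here as ambiguous.
    kept = []
    for r in reads:
        if r.human_score is None:
            continue
        if r.mouse_score is None or r.human_score > r.mouse_score:
            kept.append(r.name)
    return kept


# Made-up reads covering the four interesting cases.
EXAMPLE_READS = [
    Read("h1", 100, None),  # human only
    Read("m1", None, 98),   # mouse only
    Read("hm", 95, 80),     # aligns to both, better to human
    Read("mh", 60, 99),     # aligns to both, better to mouse
]

if __name__ == "__main__":
    print("direct:  ", direct_to_human(EXAMPLE_READS))
    print("filter:  ", filtering(EXAMPLE_READS))
    print("combined:", combined_reference(EXAMPLE_READS))
```

Under these assumptions, direct alignment keeps the mouse-derived read `mh` (a potential source of false variants) and filtering discards the human read `hm` (a potential source of missed variants). The combined reference keeps `hm` and drops `mh`, which is at least consistent with the reported reduction in false negatives.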

Conclusions

Our study systematically analyzes the effects of mouse contamination in the sequencing data of human-in-mouse xenografts. Our findings provide information for designing data analysis pipelines for these data.

【 License 】

   
2014 Tso et al.; licensee BioMed Central.
