BMC Bioinformatics
Identification of single nucleotide polymorphisms from the transcriptome of an organism with a whole genome duplication
Kris A Christensen [1], Joseph P Brunelli [1], Matthew J Lambert [1], Jenefer DeKoning [1], Ruth B Phillips [1], Gary H Thorgaard [1]
[1] Center for Reproductive Biology, Washington State University, Pullman WA 99164-7520, USA
Keywords: Genome duplication; Rainbow trout; Polyploid; SNP
Others: 1087712; DOI: 10.1186/1471-2105-14-325
Received 2013-06-28, accepted 2013-11-12, published 2013
【 Abstract 】
Background
The common ancestor of salmonid fishes, including rainbow trout (Oncorhynchus mykiss), experienced a whole genome duplication between 20 and 100 million years ago, and many of the duplicated genes have been retained in the trout genome. This retention complicates efforts to detect allelic variation in salmonid fishes. Specifically, single nucleotide polymorphism (SNP) detection is problematic because nucleotide variation can be found between the duplicate copies (paralogs) of a gene as well as between alleles.
Results
We present a method for differentiating between allelic and paralogous (gene copy) sequence variants, which allows SNPs to be identified in organisms with multiple copies of a gene or set of genes. The basic strategy is to: 1) identify windows of unique cDNA sequences with homology to each other, 2) compare these unique cDNAs if they are not shared between individuals (i.e. one individual is homozygous for one cDNA and the other individual is homozygous for another), and 3) assign each candidate sequence variant a “SNP score” between zero and one based on six criteria. Using this strategy, we detected about seven thousand potential SNPs from the transcriptomes of several clonal lines of rainbow trout. By directly comparing our results to a pre-validated set of SNPs in polyploid wheat, we also estimated the false-positive rate of this strategy to be 0 to 28%, depending on the parameters used.
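The three-step strategy above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the equal-length window assumption, the `max_mismatches` parameter, and the equal weighting of criteria in `snp_score` are all hypothetical, and the six actual criteria are not listed in this abstract. Each individual is represented simply as a set of unique cDNA window sequences for one homologous group.

```python
from itertools import product

def private_sequences(windows_a, windows_b):
    """Split two sets of window sequences into those seen only in A and only in B.

    Sequences shared by both individuals are dropped, since the strategy
    compares only cDNAs that are not shared between individuals.
    """
    set_a, set_b = set(windows_a), set(windows_b)
    return set_a - set_b, set_b - set_a

def mismatches(seq_a, seq_b):
    """Return (position, base_a, base_b) for each differing 0-based position."""
    return [(i, x, y) for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]

def candidate_variants(windows_a, windows_b, max_mismatches=1):
    """Pair private sequences of equal length that differ at a few positions.

    max_mismatches is a hypothetical threshold, not a value from the paper.
    """
    only_a, only_b = private_sequences(windows_a, windows_b)
    candidates = []
    for seq_a, seq_b in product(sorted(only_a), sorted(only_b)):
        if len(seq_a) != len(seq_b):
            continue  # assumption: windows are compared only at equal length
        diffs = mismatches(seq_a, seq_b)
        if 0 < len(diffs) <= max_mismatches:
            candidates.append((seq_a, seq_b, diffs))
    return candidates

def snp_score(criteria):
    """Toy score in [0, 1]: the fraction of pass/fail criteria that pass.

    Equal weighting is an assumption made for illustration only.
    """
    if not criteria:
        raise ValueError("at least one criterion is required")
    return sum(bool(c) for c in criteria) / len(criteria)

# Two hypothetical clonal lines; "ACGTACGT" is shared and therefore ignored.
line_a = {"ACGTACGT", "ACGAACGT"}
line_b = {"ACGTACGT", "ACGAACCT"}
candidates = candidate_variants(line_a, line_b)
score = snp_score([True, True, True, False, False, False])
```

In this toy input, the shared window is excluded and the two private windows differ at a single position, so one candidate variant is reported. A real pipeline would also need alignment, read-depth, and quality information, which this sketch leaves out.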
Conclusions
This strategy has an advantage over traditional SNP identification techniques because it uses an additional dimension of sequencing information. The method is especially well suited to identifying SNPs in polyploids, both outbred and inbred, but would tend to be conservative for diploid organisms.
【 License 】
2013 Christensen et al.; licensee BioMed Central Ltd.