Plant Methods
Next generation sequencing and de novo transcriptomics to study gene evolution
Joshua S Mylne [3], James Whelan [1], Oliver Berkowitz [2], Kalia Bernath-Levin [3], David Secco [3], Achala S Jayasena [3]
[1] La Trobe University, Department of Botany, School of Life Sciences & ARC Centre of Excellence in Plant Energy Biology, AgriBio, the Centre for AgriBioscience, 5 Ring Road, Melbourne, Bundoora Victoria 3086, Australia
[2] The University of Western Australia, School of Plant Biology, 35 Stirling Highway, Crawley, Perth 6009, Australia
[3] The University of Western Australia, School of Chemistry and Biochemistry & ARC Centre of Excellence in Plant Energy Biology, 35 Stirling Highway, Crawley, Perth 6009, Australia
Keywords: Cyclic peptides; PawS1; Gene evolution; De novo transcriptomics
DOI: 10.1186/1746-4811-10-34
Received 2014-07-30; accepted 2014-10-08; published 2014
【 Abstract 】
Background
Studying gene evolution in non-model species by PCR-based approaches is limited to highly conserved genes. The plummeting cost of next generation sequencing enables the application of de novo transcriptomics to any species.
Results
Here we describe how to apply de novo transcriptomics to pursue the evolution of a single gene of interest. We follow a rapidly evolving seed protein gene that encodes small, stable peptides. We use software that requires only a limited bioinformatics background and assemble four de novo seed transcriptomes. To demonstrate the quality of the assemblies, we confirm the predicted genes at the peptide level in one species that has over ten copies of our gene of interest. We explain strategies that favour assembly of low-abundance genes, which assembly parameters help capture the maximum number of transcripts, and how to develop a suite of control genes to test assembly quality. We also compare several sequencing depths to optimise cost and data volume.
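A comparison of sequencing depths like the one described above can be approximated by randomly subsampling read pairs before assembly. The following is a minimal sketch, not the authors' pipeline: it assumes paired reads are already held in memory as tuples, and the function name `subsample_pairs`, the chosen fractions, and the toy reads are all hypothetical.

```python
import random

def subsample_pairs(pairs, fraction, seed=0):
    """Return a random subset of read pairs, keeping mates together.

    `pairs` is a list of (forward_read, reverse_read) tuples; `fraction`
    is the share of pairs to keep. Both are illustrative assumptions.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so a subsample can be reproduced
    n_keep = round(len(pairs) * fraction)
    # Sample indices, then sort them so the original read order is preserved.
    keep = sorted(rng.sample(range(len(pairs)), n_keep))
    return [pairs[i] for i in keep]

# Toy data standing in for real paired-end reads (name, sequence, quality).
pairs = [
    ((f"read{i}/1", "ACGT", "IIII"), (f"read{i}/2", "TGCA", "IIII"))
    for i in range(100)
]

for fraction in (0.25, 0.5, 1.0):
    subset = subsample_pairs(pairs, fraction)
    print(f"fraction={fraction}: {len(subset)} pairs kept")
```

Each subsample would then be assembled separately and scored against the same set of control genes, so that the shallowest depth that still recovers them can be identified.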
Conclusions
De novo transcriptomics is an effective approach for studying gene evolution in species for which genome support is lacking.
【 License 】
2014 Jayasena et al.; licensee BioMed Central Ltd.
【 Figures and Tables 】
Figure 1.
Figure 2.
Figure 3.
Figure 4.
Figure 5.