BMC Research Notes
RNA sequencing read depth requirement for optimal transcriptome coverage in Hevea brasiliensis
Zainorlina Mohd-Zainuddin [2], Chee-Choong Hoh [1], Ahmad-Kamal Ghazali [1], Keng-See Chow [2]
[1] Codon Genomics SB, No. 26, Jalan Dutamas 7, Taman Dutamas, Balakong 43200, Seri Kembangan Balakong, Selangor, Malaysia
[2] Biotechnology Unit, Malaysian Rubber Board, Rubber Research Institute of Malaysia, Experiment Station, Kuala Lumpur 47000, Sungai Buloh, Selangor, Malaysia
Keywords: Gene transcript; de novo assembly; Rubber tree; Hevea brasiliensis; Sequencing; RNA-Seq; Transcriptome
DOI: 10.1186/1756-0500-7-69
Received 2013-09-11, accepted 2014-01-17, published 2014
【 Abstract 】
Background
A key concern in de novo transcriptome assembly is determining how much read sequence is required to ensure comprehensive coverage of the genes expressed in a particular sample. In this report, we describe the use of Illumina paired-end RNA-Seq (PE RNA-Seq) reads from Hevea brasiliensis (rubber tree) bark to devise a transcript mapping approach for estimating the read amount needed for deep transcriptome coverage.
Findings
We optimized the assembly of a Hevea bark transcriptome based on 16 Gb of Illumina PE RNA-Seq reads using the Oases assembler across a range of k-mer sizes. We then assessed assembly quality based on transcript N50 length and transcript mapping statistics in relation to (a) known Hevea cDNAs with complete open reading frames, (b) a set of core eukaryotic genes and (c) Hevea genome scaffolds. This was followed by a systematic transcript mapping process in which sub-assemblies from a series of incremental amounts of bark transcripts were aligned to transcripts from the entire bark transcriptome assembly. The exercise served to relate read amounts to the level of transcript mapping, the latter being an indicator of the coverage of gene transcripts expressed in the sample. As the read amount (data size) increased toward 16 Gb, the number of transcripts mapped to the entire bark assembly approached saturation. A colour matrix was subsequently generated to illustrate the sequencing depth required in relation to the degree of coverage of total sample transcripts.
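Transcript N50 length, one of the quality measures named above, can be illustrated with a minimal sketch. The function name `n50` and the toy lengths are assumptions made for illustration only; they are not taken from the study, and the authors' own tooling is not described here.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript lengths.

    N50 is the length L such that transcripts of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that pushes the running sum to half
        # of the total is the N50.
        if running >= half_total:
            return length


# Toy transcript lengths in bases (hypothetical, not study data).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

A higher N50 generally indicates longer assembled transcripts, which is why it is commonly compared across k-mer sizes when tuning an assembly.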
Conclusions
We devised a procedure, the “transcript mapping saturation test”, to estimate the amount of RNA-Seq reads needed for deep coverage of transcriptomes. For Hevea de novo assembly, we propose generating between 5 and 8 Gb of reads, with which around 90% transcript coverage could be achieved using optimized k-mers and transcript N50 length. The principle behind this methodology may also be applied to other non-model plants, or to reads from other second-generation sequencing platforms.
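The logic of a saturation test of this kind can be sketched as follows, under stated assumptions: each sub-assembly is reduced to the set of full-assembly transcript IDs it maps to, the mapped fraction is computed per read amount, and the smallest read amount reaching a chosen coverage threshold is reported. The helper names `mapping_fraction` and `smallest_depth_reaching`, the transcript IDs, and the depth units are all hypothetical; the alignment step itself is not shown.

```python
def mapping_fraction(mapped_ids, reference_ids):
    """Fraction of reference transcripts hit by a sub-assembly."""
    reference = set(reference_ids)
    if not reference:
        raise ValueError("reference_ids must not be empty")
    # Only count hits that are genuine reference transcripts.
    return len(set(mapped_ids) & reference) / len(reference)


def smallest_depth_reaching(curve, threshold):
    """Return the smallest depth whose mapped fraction >= threshold.

    `curve` is an iterable of (depth, fraction) pairs; returns None
    if no depth reaches the threshold.
    """
    for depth, fraction in sorted(curve):
        if fraction >= threshold:
            return depth
    return None


# Toy data in arbitrary depth units (hypothetical, not study results):
# ten reference transcripts and the subsets mapped at each depth.
reference = [f"t{i}" for i in range(1, 11)]
hits_by_depth = {
    1: reference[:4],
    2: reference[:7],
    3: reference[:9],
    4: reference[:10],
}
curve = [(d, mapping_fraction(ids, reference)) for d, ids in hits_by_depth.items()]
print(smallest_depth_reaching(curve, 0.9))
```

In practice the threshold would be chosen from the flattening of the curve rather than fixed in advance; the 0.9 used here simply mirrors the roughly 90% coverage figure quoted in the conclusions.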
【 License 】
2014 Chow et al.; licensee BioMed Central Ltd.