Journal article

【Abstract】

Background

Deep sequencing technology provides efficient and economical production of large numbers of randomly positioned, relatively short estimates of base identities in DNA molecules. Applying this technology to mRNA samples allows rapid examination of the transcriptome, the molecular genetic environment in individual cells or tissues. However, assembling such short sequences into complete mRNA sequences is a challenge that limits the usefulness of the technology, particularly when little or no genomic data is available. Several approaches to this problem have been developed, but there is still no general method for rapidly obtaining an mRNA sequence from deep-sequencing data when a specific molecule, or family of molecules, is of interest. A frequent requirement is to identify specific mRNA molecules from tissues that are being investigated by methods such as electrophysiology, immunocytology and pharmacology. To be widely useful, any approach must run on readily available hardware and be relatively simple to use in the laboratory by operators without extensive statistical or bioinformatics knowledge.

Findings

An approach was developed that allows de novo assembly of individual mRNA sequences in two linked stages: sequence discovery and sequence completion. Both stages rely on computer-assisted user interaction with the data, guided by a Graphical User Interface (GUI), but proceed relatively efficiently once discovery is complete. The method grows a discovered sequence in a series of steps, each requiring a further pass through the complete raw data, and is hence termed ‘transcriptome walking’. All of the operations required for transcriptome analysis are combined in one program that presents a relatively simple user interface and runs on a standard desktop or laptop computer, taking advantage of multi-core processors when available. Complete mRNA sequence identification usually requires less than 24 hours. This approach has already identified previously unknown mRNA sequences in two animal species that currently lack any significant genome or transcriptome data.
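The idea of growing a discovered sequence by repeated passes through the raw reads can be illustrated with a minimal Python sketch. This is not the algorithm implemented in the author's program, which the abstract does not specify. The function names, the `min_overlap` threshold, the one-base-per-pass extension and the majority vote are all assumptions made for illustration, and the sketch extends only the 3' end of a seed on a single strand.

```python
from collections import Counter


def next_base_votes(contig, reads, min_overlap):
    """One full pass over the reads (hypothetical helper).

    For every read whose prefix overlaps the current end of the contig by
    at least min_overlap bases, record the base that follows the overlap.
    """
    votes = Counter()
    for read in reads:
        # The overlap must leave at least one base of the read unused.
        longest = min(len(read) - 1, len(contig))
        for k in range(longest, min_overlap - 1, -1):
            if contig.endswith(read[:k]):
                votes[read[k]] += 1
                break  # use only the longest overlap for this read
    return votes


def walk_right(seed, reads, min_overlap=8, max_passes=10000):
    """Extend seed one base per pass until no read supports a further step.

    max_passes is an illustrative guard against endless walking through
    repetitive sequence.
    """
    contig = seed
    for _ in range(max_passes):
        votes = next_base_votes(contig, reads, min_overlap)
        if not votes:
            break
        # Majority vote among overlapping reads (an assumed rule).
        contig += votes.most_common(1)[0][0]
    return contig
```

Calling `walk_right(seed, reads)` with a short seed found during discovery and the full list of read strings returns the seed extended as far as the reads support. Each extension step scans every read, which is why a walk of this kind is dominated by repeated passes through the data and would benefit from splitting the scan across processor cores.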

Conclusions

As deep-sequencing data becomes more widely available, accessible methods for extracting useful sequence information in the biological or medical laboratory will be of increasing importance. The approach described here does not rely on detailed knowledge of bioinformatic algorithms. It allows users with basic knowledge of molecular biology and standard laboratory computing equipment, but limited software or bioinformatics experience, to extract complete gene sequences from deep-sequencing data.

【License】

2012 French; licensee BioMed Central Ltd.

【References】

[1] Sequence.exe, a complete, self-unpacking Windows installer package, is available as Additional file 1, and the latest version of the package is available from the project homepage at http://asf-pht.medicine.dal.ca/Downloads/Sequence.exe

BMC Research Notes
Transcriptome walking: a laboratory-oriented GUI-based approach to mRNA identification from deep-sequenced data

Andrew S French¹
¹ Department of Physiology and Biophysics, Dalhousie University, PO Box 15000, Halifax, NS B3H 4R2, Canada
Keywords: Bioinformatics; Transcriptome; GUI; Assembly; Deep-sequencing
Others: 1165053; DOI: 10.1186/1756-0500-5-673

Received 2012-09-14, accepted 2012-11-22, published 2012