Journal Article Details
BMC Genomics
Identifying similar transcripts in a related organism from de Bruijn graphs of RNA-Seq data, with applications to the study of salt and waterlogging tolerance in Melilotus
Natasha L. Teakle [1], Shuhua Fu [2], Sing-Hoi Sze [2], Aaron M. Tarone [3], Maren L. Friesen [4], Peter L. Chang [4]
[1] Centre for Ecohydrology, The University of Western Australia
[2] Department of Biochemistry & Biophysics, Texas A&M University
[3] Department of Entomology, Texas A&M University
[4] Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California
Keywords: de Bruijn graph; RNA-Seq; Melilotus
DOI: 10.1186/s12864-019-5702-5
Source: DOAJ
【 Abstract 】

Background: A popular strategy for studying alternative splicing in non-model organisms is to sequence the entire transcriptome and then assemble the reads with a de novo transcriptome assembly algorithm to obtain predicted transcripts. A similarity search against a related organism is then used to infer possible functions of these predicted transcripts. Some of these predictions may be inaccurate, and transcripts with low coverage are often missed. We observe that a more complete set of transcripts, and therefore more possible functional assignments, can be obtained by starting the search from the intermediate de Bruijn graph, which contains all branching possibilities.

Results: We develop an algorithm that extracts transcripts similar to those of a related organism by starting the search from the de Bruijn graph that represents the transcriptome rather than from predicted transcripts. We show that our algorithm recovers more similar transcripts than existing algorithms, with large improvements in obtaining longer transcripts and a finer resolution of isoforms. We apply our algorithm to study salt and waterlogging tolerance in two Melilotus species by constructing new RNA-Seq libraries.

Conclusions: We have developed an algorithm that directly identifies paths in the de Bruijn graph that correspond to similar transcripts in a related organism. Our strategy bypasses the transcript prediction step in RNA-Seq data analysis and makes use of support from evolutionary information.
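The general idea of searching a de Bruijn graph instead of assembled transcripts can be illustrated with a small sketch. It is not the authors' algorithm: the function names (`build_de_bruijn`, `enumerate_paths`, `best_match`), the toy reads, and the reference sequence are all hypothetical, and Python's `difflib.SequenceMatcher` stands in for a proper sequence alignment against a related organism. Exhaustive path enumeration is only feasible on a toy graph like this one.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def enumerate_paths(graph, start, max_nodes):
    """Yield the sequence spelled by each maximal simple path starting at `start`."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        # Skip nodes already on the path so that cycles cannot loop forever.
        successors = [n for n in sorted(graph.get(node, ())) if n not in path]
        if not successors or len(path) >= max_nodes:
            # First node in full, then the last base of every later node.
            yield path[0] + "".join(p[-1] for p in path[1:])
            continue
        for nxt in successors:
            stack.append((nxt, path + [nxt]))


def best_match(graph, start, reference, max_nodes=50):
    """Return the path sequence most similar to `reference` (assumed scoring: difflib ratio)."""
    return max(
        enumerate_paths(graph, start, max_nodes),
        key=lambda seq: SequenceMatcher(None, seq, reference).ratio(),
    )


# Toy reads covering two variants that share a prefix and a suffix.
reads = ["ATGGCGTA", "GCGTACC", "ATGGCTTA", "CTTACC"]
graph = build_de_bruijn(reads, k=4)

# Hypothetical transcript from a related organism, one mismatch from one variant.
reference = "ATGGCTTTCC"
print(sorted(enumerate_paths(graph, "ATG", 50)))
print(best_match(graph, "ATG", reference))
```

Because both branches remain in the graph, the variant closest to the reference can be selected directly, even where an assembler might have reported only one of the two paths.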

【 License 】

Unknown   
