Journal Article Details
PLoS One
HMMSplicer: A Tool for Efficient and Sensitive Discovery of Known and Novel Splice Junctions in RNA-Seq Data
Katherine Sorber [1]; Michelle T. Dimon [2]; Joseph L. DeRisi [2]
[1] Biological and Medical Informatics Program, University of California San Francisco, San Francisco, California, United States of America; Department of Biochemistry and Biophysics, University of California San Francisco, San Francisco, California, United States of America
Keywords: Introns; Hidden Markov models; Sequence alignment; Sequence motif analysis; Arabidopsis thaliana; Multiple alignment calculation; Alternative splicing; Plasmodium
DOI: 10.1371/journal.pone.0013875
Subject classification: Medicine (General)
Source: Public Library of Science
【 Abstract 】

Background: High-throughput sequencing of an organism's transcriptome, or RNA-Seq, is a valuable and versatile new strategy for capturing snapshots of gene expression. However, transcriptome sequencing creates a new class of alignment problem: mapping short reads that span exon-exon junctions back to the reference genome, especially when a splice junction is previously unknown.

Methodology/Principal Findings: Here we introduce HMMSplicer, an accurate and efficient algorithm for discovering canonical and non-canonical splice junctions in short read datasets. HMMSplicer identifies more splice junctions than currently available algorithms when tested on publicly available A. thaliana, P. falciparum, and H. sapiens datasets, without a reduction in specificity.

Conclusions/Significance: HMMSplicer was found to perform especially well in compact genomes and on genes with low expression levels, alternative splice isoforms, or non-canonical splice junctions. Because HMMSplicer does not rely on pre-built gene models, the products of inexact splicing are also detected. For H. sapiens, we find 3.6% of 3′ splice sites and 1.4% of 5′ splice sites are inexact, typically differing by 3 bases in either direction. In addition, HMMSplicer provides a score for every predicted junction, allowing the user to set a threshold to tune false positive rates depending on the needs of the experiment. HMMSplicer is implemented in Python. Code and documentation are freely available at http://derisilab.ucsf.edu/software/hmmsplicer.
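
The score threshold mentioned in the abstract can be illustrated with a minimal Python sketch. It is not HMMSplicer's own code or output format: the `Junction` record, its field names, the coordinate convention, the example scores, and the `filter_by_score` helper are all hypothetical, chosen only to show how a per-junction score lets a user trade sensitivity against false positives.

```python
from typing import List, NamedTuple


class Junction(NamedTuple):
    """Hypothetical record for one predicted splice junction (not HMMSplicer's format)."""
    chrom: str
    intron_start: int  # assumed convention: first intronic base
    intron_end: int    # assumed convention: last intronic base
    score: float       # higher means more confident in this sketch


def filter_by_score(junctions: List[Junction], min_score: float) -> List[Junction]:
    """Keep only junctions whose score is at least min_score."""
    return [j for j in junctions if j.score >= min_score]


# Made-up example predictions with made-up scores.
junctions = [
    Junction("chr1", 1000, 1500, 1250.0),
    Junction("chr1", 2000, 2600, 480.0),
    Junction("chr2", 300, 900, 790.0),
]

# A stricter threshold keeps fewer junctions and should lower the false positive rate;
# a looser one keeps more junctions at the cost of more false positives.
kept = filter_by_score(junctions, min_score=600.0)
print(len(kept), "of", len(junctions), "junctions kept")
```

Raising `min_score` suits experiments that need high-confidence junctions, while lowering it suits exploratory searches for rare or novel junctions.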

【 License 】

CC BY   
