Journal article details
PeerJ
Uncovering full-length transcript isoforms of sugarcane cultivar Khon Kaen 3 using single-molecule long-read sequencing
article
Jittima Piriyapongsa [1], Pavita Kaewprommal [1], Sirintra Vaiwsri [1], Songtham Anuntakarun [1], Warodom Wirojsirasak [2], Prapat Punpee [2], Peeraya Klomsa-ard [2], Philip J. Shaw [1], Wirulda Pootakham [1], Thippawan Yoocha [1], Duangjai Sangsrakru [1], Sithichoke Tangphatsornruang [1], Sissades Tongsima [1], Somvong Tragoonrung [1]
[1] National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency; [2] Mitr Phol Sugarcane Research Center Co., Ltd.
Keywords: Sugarcane; Full-length transcripts; Transcriptome; Single-molecule long-read sequencing; Iso-Seq; PacBio sequencing; Khon Kaen 3; KK3
DOI: 10.7717/peerj.5818
Subject classification: Social sciences, humanities and arts (general)
Source: Inra
【 Abstract 】

Background: Sugarcane is an important global food crop and energy resource. Genome and gene information is needed to study traits at the molecular level and so support sugarcane improvement programs. Most currently available transcriptome data for sugarcane were generated on second-generation sequencing platforms, which produce short reads. Transcripts assembled de novo from these data are limited in length and may therefore be incomplete and inaccurate, especially for long RNAs.

Methods: We generated a transcriptome dataset from leaf tissue of the commercial Thai sugarcane cultivar Khon Kaen 3 (KK3) using PacBio RS II single-molecule long-read sequencing with the Iso-Seq method. Short-read RNA-Seq data were generated from the same RNA sample on the Ion Proton platform and used to reduce base-calling errors.

Results: A total of 119,339 error-corrected transcripts were generated, with an N50 length of 3,611 bp, which is longer on average than any previously reported sugarcane transcriptome dataset. Of these, 110,253 sequences (92.4%) contain an open reading frame (ORF) of at least 300 bp, with an ORF N50 of 1,416 bp. The mean lengths of the 5′ and 3′ untranslated regions in the 73,795 sequences with complete ORFs are 1,249 bp and 1,187 bp, respectively. A total of 4,774 transcripts are putatively novel full-length transcripts that have no match in a previous Iso-Seq study of sugarcane. We annotated the functions of 68,962 putative full-length transcripts that have at least 90% coverage when compared with homologous protein-coding sequences in other plants.

Discussion: The new catalog of transcripts will be useful for genome annotation, identification of splicing variants, SNP identification, and other research pertaining to sugarcane improvement. The putatively novel transcripts suggest unique features of KK3, although more data from different tissues and developmental stages are needed to establish a reference transcriptome for this cultivar.
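
For readers unfamiliar with the N50 values reported in the abstract, the following is a minimal sketch of the standard N50 definition: the length L such that sequences of length L or longer account for at least half of the total bases. The function name `n50` and the toy lengths are illustrative assumptions; this is not the authors' pipeline and does not use their data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (standard definition)."""
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    if not ordered:
        raise ValueError("no lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of all bases.
        if running >= half_total:
            return length


# Hypothetical toy lengths in bp, chosen only for illustration.
toy_lengths = [500, 1200, 2400, 3600, 4100]
print(n50(toy_lengths))  # 3600: 4100 + 3600 = 7700 >= 11800 / 2
```

Because N50 weights each sequence by its length, it sits above the median whenever a dataset mixes a few long transcripts with many short ones, which is why it is commonly reported alongside transcript counts.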

【 License 】

CC BY
