Journal Article Details
BMC Genomics
Improved annotation with de novo transcriptome assembly in four social amoeba species
Research Article
Pauline Schaap1  Christina Schilde1  Hajara M. Lawal1  Geoffrey J. Barton2  Christian Cole2  Reema Singh3  Gernot Glöckner4 
[1] Cell and Development Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, UK; Computational Biology, School of Life Sciences, University of Dundee, Dow Street, Dundee, UK; Institute of Biochemistry I, Medical Faculty, University of Cologne, D-50931, Cologne, Germany; Leibniz Institute of Freshwater Ecology and Inland Fisheries (IGB), Müggelseedamm 301, D-12587, Berlin, Germany
Keywords: Dictyostelia; Social amoeba; De novo; Transcriptome assembly; RNA-seq
DOI: 10.1186/s12864-017-3505-0
Received 2016-05-21, accepted 2017-01-14, published 2017
Source: Springer
【 Abstract 】

Background: Annotation of gene models and transcripts is a fundamental step in genome sequencing projects. Often this is performed with automated prediction pipelines, which can miss complex and atypical genes or transcripts. RNA sequencing (RNA-seq) data can aid the annotation with empirical data. Here we present de novo transcriptome assemblies generated from RNA-seq data in four Dictyostelid species: D. discoideum, P. pallidum, D. fasciculatum and D. lacteum. The assemblies were integrated with existing gene models to determine corrections and improvements on a whole-genome scale. This is the first time this has been performed in these eukaryotic species.

Results: An initial de novo transcriptome assembly was generated by Trinity for each species and then refined with the Program to Assemble Spliced Alignments (PASA). Completeness and quality were assessed with the Benchmarking Universal Single-Copy Orthologs (BUSCO) and Transrate tools at each stage of the assemblies. The final datasets of 11,315-12,849 transcripts contained 5,610-7,712 updates and corrections to >50% of existing gene models, including changes to hundreds or thousands of protein products. Putative novel genes were also identified, and alternative splice isoforms were observed for the first time in P. pallidum, D. lacteum and D. fasciculatum.

Conclusions: By taking a whole-transcriptome approach to genome annotation with empirical data, we have been able to enrich the annotations of four existing genome sequencing projects. In doing so we have identified updates to the majority of the gene annotations across all four species under study and found putative novel genes and transcripts that could be worthy of follow-up. The new transcriptome data we present here will be a valuable resource for genome curators in the Dictyostelia, and we propose this effective methodology for use in other genome annotation projects.

【 License 】

CC BY   
© The Author(s). 2017
