Journal Article Details
BMC Genomics
A robust (re-)annotation approach to generate unbiased mapping references for RNA-seq-based analyses of differential expression across closely related species
Methodology Article
Alistair P. McGregor1  Isabel Almudi2  Montserrat Torres-Oliva3  Nico Posnien3 
[1] Department of Biological and Medical Sciences, Oxford Brookes University, Gipsy Lane, OX3 0BP, Oxford, UK; Andalusian Centre of Developmental Biology, carretera de Utrera, km.1, 41013, Seville, Spain; Georg-August-Universität Göttingen, Johann-Friedrich-Blumenbach-Institut für Zoologie und Anthropologie, Abteilung für Entwicklungsbiologie, GZMB Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077, Göttingen, Germany; Göttingen Center for Molecular Biosciences (GZMB), GZMB Ernst-Caspari-Haus, Justus-von-Liebig-Weg 11, 37077, Göttingen, Germany
Keywords: RNA-seq; Annotation; Differential gene expression; EXONERATE; Drosophila; Closely related species; Emerging model systems; RPKM; DESeq2; Voom; Limma; Length bias
DOI: 10.1186/s12864-016-2646-x
Received 2015-10-23, accepted 2016-04-22, published 2016
Source: Springer
【 Abstract 】

Background: RNA-seq based on short reads generated by next-generation sequencing technologies has become the main approach to study differential gene expression. Until now, the main applications of this technique have been to study the variation of gene expression in a whole organism, tissue or cell type under different conditions or at different developmental stages. However, RNA-seq also has great potential for evolutionary studies that investigate gene expression divergence in closely related species.

Results: We show that the published genomes and annotations of the three closely related Drosophila species D. melanogaster, D. simulans and D. mauritiana have limitations for inter-specific gene expression studies. These limitations arise from missing gene models in at least one of the genome annotations, unclear orthology assignments and significant gene length differences between the species. A comprehensive evaluation of four statistical frameworks (DESeq2, DESeq2 with length correction, RPKM-limma and RPKM-voom-limma) shows that none of these methods sufficiently accounts for inter-specific gene length differences, which inevitably results in false positive candidate genes. We propose that published reference genomes should be re-annotated before they are used as references for RNA-seq experiments, both to include as many genes as possible and to account for a potential length bias. We present a straightforward reciprocal re-annotation pipeline that allows reliable comparison of expression for nearly all genes annotated in D. melanogaster.

Conclusions: We conclude that our reciprocal re-annotation of previously published genomes facilitates the analysis of significantly more genes in an inter-specific differential gene expression study. We propose that the established pipeline can easily be applied to re-annotate other genomes of closely related animals and plants to improve comparative expression analyses.
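The length bias mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' pipeline or evaluation; all numbers, and the names `rpkm`, `length_a` and `length_b`, are hypothetical. It assumes one orthologous gene with identical per-base read coverage in two species, but a shorter annotated gene model in one of them, and shows that raw counts then differ even though expression does not. RPKM rescales by annotated length in this idealized case, although the abstract reports that length-aware frameworks did not sufficiently remove the bias on real data.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical values chosen only for illustration.
total_mapped_reads = 20_000_000  # same library size in both species
reads_per_bp = 0.5               # identical underlying coverage of the gene
length_a = 2000                  # annotated gene length in species A (bp)
length_b = 1500                  # shorter (e.g. truncated) gene model in species B

# Raw counts scale with the annotated length, not only with expression.
count_a = int(reads_per_bp * length_a)
count_b = int(reads_per_bp * length_b)

# Apparent fold change from raw counts, caused purely by annotation length.
fold_change_counts = count_a / count_b

# Length-normalized values coincide in this idealized, noise-free setting.
rpkm_a = rpkm(count_a, length_a, total_mapped_reads)
rpkm_b = rpkm(count_b, length_b, total_mapped_reads)

print(f"raw counts: {count_a} vs {count_b} (ratio {fold_change_counts:.2f})")
print(f"RPKM:       {rpkm_a:.2f} vs {rpkm_b:.2f}")
```

In this toy case the count ratio reflects only the difference in annotated length, which is the kind of artefact that comparable gene models across species are meant to avoid.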

【 License 】

CC BY   
© Torres-Oliva et al. 2016
