BMC Genomics
XSAnno: a framework for building ortholog models in cross-species transcriptome comparisons
Nenad Šestan [1], André MM Sousa [2], Mingfeng Li [1], Ying Zhu [1]
[1] Department of Neurobiology, Kavli Institute for Neuroscience, Yale School of Medicine, 06510 New Haven, CT, USA
[2] Graduate Program in Areas of Basic and Applied Biology, Abel Salazar Biomedical Sciences Institute, University of Porto, 4099-003 Porto, Portugal
Keywords: Chimpanzee; Macaque; Primate; Human evolution; Evolution; Prefrontal cortex; Gene expression; RNA-seq; Ortholog annotation; Comparative transcriptomics
DOI: 10.1186/1471-2164-15-343
Received 2014-01-17, accepted 2014-04-24, published 2014
【 Abstract 】
Background
Accurate characterization of RNA transcripts and their expression levels across species is critical for understanding transcriptome evolution. As RNA-seq data accumulate rapidly, there is great demand for tools that build gene annotations for cross-species RNA-seq analysis. However, prevailing methods of ortholog annotation for RNA-seq analysis between closely related species do not account for inter-species variation in mappability.
Results
Here we present XSAnno, a computational framework that integrates previous approaches with multiple filters to improve the accuracy of inter-species transcriptome comparisons. Applying this approach to RNA-seq data from human, chimpanzee, and rhesus macaque brain transcriptomes reduced the false discovery of differentially expressed genes while maintaining a low false negative rate.
Conclusion
The present study demonstrates the utility of the XSAnno pipeline in building ortholog annotations and improving the accuracy of cross-species transcriptome comparisons.
【 License 】
2014 Zhu et al.; licensee BioMed Central Ltd.