BMC Genomics
Evolutionary insights from de novo transcriptome assembly and SNP discovery in California white oaks
Victoria L. Sork1, Paul F. Gugger2, Shawn J. Cokus3
[1] Institute of the Environment and Sustainability, University of California, 300 La Kretz Hall, 619 Charles E. Young Drive East, Los Angeles 90095-1496, CA, USA; Ecology and Evolutionary Biology, University of California, 4140 Terasaki Life Sciences Building, 610 Charles E. Young Drive East, Los Angeles 90095-7239, CA, USA; Molecular, Cell, and Developmental Biology, University of California, 3000 Terasaki Life Sciences Building, 610 Charles E. Young Drive East, Los Angeles 90095-7239, CA, USA
Keywords: Transcriptome; Single-nucleotide polymorphism; RNA-Seq; Quercus lobata; Quercus garryana; Quercus douglasii; dN/dS; Divergence; De novo assembly; Annotation
Others: 1221852; DOI: 10.1186/s12864-015-1761-4
Received 2014-11-10, accepted 2015-07-07, published 2015
【 Abstract 】
Background
Reference transcriptomes provide valuable resources for understanding evolution within and among species. We de novo assembled and annotated a reference transcriptome for Quercus lobata and Q. garryana and identified single-nucleotide polymorphisms (SNPs) to provide resources for forest genomicists studying this ecologically and economically important genus. We further performed preliminary analyses of genes important in interspecific divergent (positive) selection that might explain ecological differences among species, estimating the ratio of nonsynonymous to synonymous substitution rates (dN/dS) and Fay and Wu's H. Functional classes of genes were tested for unusually high dN/dS or unusually low H, either of which is consistent with divergent positive selection.
Results
Our draft transcriptome is among the most complete for oaks, including 83,644 contigs (23,329 of them ≥ 1 kbp), 14,898 complete and 13,778 partial gene models, and functional annotations for 9,431 Arabidopsis orthologs and 19,365 contigs with Pfam hits. We identified 1.7 million possible sequence variants, including 1.1 million high-quality diallelic SNPs, among the largest sets identified in any tree. Eleven of the 18 functional categories with significantly elevated dN/dS are involved in disease response, including 50+ genes with dN/dS > 1. Other high-dN/dS genes are involved in biotic response, flowering and growth, or regulatory processes. In contrast, median dN/dS was low (0.22), suggesting that purifying selection influences most genes. No functional categories have unusually low H.
Conclusions
These results offer preliminary support for the hypothesis that divergent selection at pathogen resistance genes is an important factor in species divergence in these hybridizing California oaks. Our transcriptome provides a solid foundation for future studies of gene expression, natural selection, and speciation in Quercus.
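The abstract relies on Fay and Wu's H, which contrasts two estimators of the population mutation parameter: one based on pairwise differences (θπ) and one that weights high-frequency derived alleles (θH), with H = θπ − θH. The following is a minimal sketch of that statistic, assuming an unfolded site frequency spectrum is available for a gene. The function name `fay_wu_h`, the dictionary input, and the toy counts are hypothetical illustrations and are not taken from the authors' pipeline.

```python
def fay_wu_h(derived_counts, n):
    """Fay and Wu's H = theta_pi - theta_H from an unfolded site frequency spectrum.

    derived_counts: dict mapping i (number of sampled chromosomes carrying the
                    derived allele, 1 <= i <= n - 1) to the number of
                    segregating sites with that count (hypothetical input format).
    n:              number of sampled chromosomes.
    """
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    pairs = n * (n - 1)
    theta_pi = 0.0  # estimator based on average pairwise differences
    theta_h = 0.0   # estimator weighted toward high-frequency derived alleles
    for i, s_i in derived_counts.items():
        if not 0 < i < n:
            raise ValueError("derived allele count must lie in 1..n-1")
        theta_pi += 2.0 * s_i * i * (n - i) / pairs
        theta_h += 2.0 * s_i * i * i / pairs
    return theta_pi - theta_h


# Toy spectrum (invented numbers): 4 chromosomes; 3 singletons, 1 doubleton,
# and 2 sites where the derived allele is carried by 3 of the 4 chromosomes.
toy_sfs = {1: 3, 2: 1, 3: 2}
h_value = fay_wu_h(toy_sfs, n=4)
print(h_value)
```

In this toy case H is negative (about -1), because the two high-frequency derived sites inflate θH relative to θπ. An excess of high-frequency derived alleles of this kind is the pattern that a low H is meant to flag.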
【 License 】
2015 Cokus et al.
【 Figures 】
Fig. 1.
Fig. 2.