BMC Genomics
RiTE database: a resource database for genus-wide rice genomics and evolutionary biology
Rod A. Wing [9], Olivier Panaud [3], Scott A. Jackson [1,14], Wen Wang [1,16], Nori Kurata [1,13], Yue-ie Hsing [7], Robert Henry [1,15], Bin Han [1,10], Antonio Costa de Oliveira [4], Mingsheng Chen [5], Andrea Zuccolo [1,11], Chuanzhu Fan [6], Thomas Wicker [1,12], Hajime Ohyanagi [1,13], Stefan Roffler [1,12], Carlos E. Maldonado L. [2], Angelina Angelova [1], Rosa M. Cossu [1,11], Elena Barghini [8], Jun Wang [6], Dongying Gao [1,14], Moaine El Baidouri [1,14], Jianwei Zhang [2], Dario Copetti [9]
[1] School of Life Sciences, Heriot-Watt University, Edinburgh EH14 4AS, Scotland
[2] Arizona Genomics Institute, BIO5 Institute and School of Plant Sciences, University of Arizona, Tucson 85721, AZ, United States
[3] Laboratoire Génome et Développement des Plantes and CNRS and Laboratoire Génome et Développements des Plantes, Université de Perpignan Via Domitia, UMR CNRS/UPVD 5096, Perpignan, 66860, France
[4] Plant Genomics and Breeding Center, Federal University of Pelotas, Pelotas-RS, Brazil
[5] State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology Chinese Academy of Sciences, Beijing 100101, China
[6] Department of Biological Sciences, Wayne State University, Detroit 48202, MI, United States
[7] Institute of Plant and Microbial Biology, Academia Sinica, Nankang, Taipei 11529, Taiwan
[8] Department of Agriculture, Food, and Environment, University of Pisa, Pisa, 56124, Italy
[9] International Rice Research Institute, Genetic Resource Center, Los Baños, Laguna, Philippines
[10] National Center for Gene Research and Institute of Plant Physiology and Ecology, Shanghai Institutes of Biological Sciences, Chinese Academy of Sciences, Beijing 100029, China
[11] Institute of Life Sciences, Scuola Superiore Sant’Anna, Pisa, 56127, Italy
[12] Institute of Plant Biology, University of Zürich, Zollikerstrasse 107, Zürich, 8008, Switzerland
[13] Plant Genetics Laboratory, National Institute of Genetics, Mishima 411-8540, Shizuoka, Japan
[14] Center for Applied Genetic Technologies, University of Georgia, Athens 30602, GA, United States
[15] Queensland Alliance for Agriculture and Food Innovation, University of Queensland, Brisbane QLD 4072, Australia
[16] State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences and University of Chinese Academy of Sciences, No. 32 Jiaochang Donglu, Kunming 650223, Yunnan, China
Keywords: RiTE-db; Genome; Repeats; Transposable elements; Oryza; Rice
DOI: 10.1186/s12864-015-1762-3
Received 2015-03-11, accepted 2015-07-09, published 2015
【 Abstract 】
Background
Comparative evolutionary analysis of whole genomes requires not only accurate annotation of the gene space, but also proper annotation of the repetitive fraction, which is often the largest component of genomes larger than 50 kb.
Results
Here we present the Rice TE database (RiTE-db), a genus-wide collection of transposable elements and repeated sequences across 11 diploid species of the genus Oryza and the closely related out-group Leersia perrieri. The database consists of more than 170,000 entries divided into three main types: (i) a classified and curated set of publicly available repeated sequences; (ii) a set of consensus assemblies of highly repetitive sequences obtained from genome sequencing surveys of 12 species; and (iii) a set of full-length TEs, identified and extracted from 12 whole genome assemblies.
Conclusions
This is the first report of a repeat dataset that spans the majority of repeat variability within an entire genus, and one that includes complete elements as well as unassembled repeats. The database allows sequence browsing, downloading, and similarity searches. Because of the strategy adopted, the RiTE-db opens a new path to unprecedented direct comparative studies that span the entire nuclear repeat content of 15 million years of Oryza diversity.
【 License 】
2015 Copetti et al.