Journal article details
BMC Genomics
An improved genome release (version Mt4.0) for the model legume Medicago truncatula
Christopher D Town [6], David C Schwartz [3], Klaus FX Mayer [1], Heidrun Gundlach [1], Mark Yandell [2], Kevin L Childs [4], Laurent Gentzbittel [5], Shiguo Zhou [3], Agnes Chan [6], Benjamin Rosen [6], Shelby Bidwell [6], Vivek Krishnakumar [6], Haibao Tang [6]
[1] MIPS/IBIS Inst. for Bioinformatics and System Biology, Helmholtz Center Munich, German Research Center for Environmental Health (GmbH), Neuherberg, Germany
[2] Department of Human Genetics, University of Utah, Salt Lake City, Utah, USA
[3] Laboratory for Molecular and Computational Genomics, Department of Chemistry, University of Wisconsin-Madison, Madison, WI, USA
[4] Department of Plant Biology, Michigan State University, East Lansing, MI, USA
[5] Université de Toulouse, INP-ENSAT, CNRS, Laboratoire d’Écologie Fonctionnelle et Environnement, Toulouse, France
[6] J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD, USA
Keywords: Optical map; Gene annotation; Genome assembly; Legume; Medicago
DOI: 10.1186/1471-2164-15-312
Received 2014-02-21; accepted 2014-04-22; published 2014
【 Abstract 】

Background

Medicago truncatula, a close relative of alfalfa, is a preeminent model for studying nitrogen fixation, symbiosis, and legume genomics. The Medicago sequencing project began in 2003 with the goal of deciphering sequences originating from the euchromatic portion of the genome. The initial sequencing approach was based on a BAC tiling path and culminated in a BAC-based assembly (Mt3.5) and an in-depth analysis of the genome, both published in 2011.

Results

Here we describe a further improved and refined version of the M. truncatula genome (Mt4.0), based on a de novo whole-genome shotgun assembly of a majority of Illumina and 454 reads using ALLPATHS-LG. The ALLPATHS-LG scaffolds were anchored onto the pseudomolecules on the basis of alignments to both the optical map and the genotyping-by-sequencing (GBS) map. The Mt4.0 pseudomolecules encompass ~360 Mb of actual sequence spanning 390 Mb, of which ~330 Mb align perfectly with the optical map. This is a drastic improvement over the BAC-based Mt3.5, which contained only ~70% (~250 Mb) of the sequence in the current version. Most of the sequences and genes that previously resided on the unanchored portion of Mt3.5 have now been incorporated into the Mt4.0 pseudomolecules, with the exception of ~28 Mb of unplaced sequences. The genome has been re-annotated through our gene prediction pipeline, which integrates EST, RNA-seq, protein and gene prediction evidence. A total of 50,894 genes (31,661 high confidence and 19,233 low confidence) are included in Mt4.0, overlapping ~82% of the gene loci annotated in Mt3.5. The remaining Mt3.5 genes have either been deprecated to an “unsupported” status (14% of Mt3.5 genes) or are absent from the Mt4.0 predictions (4%).
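
As a quick consistency check on the figures quoted in this paragraph, the short Python sketch below recomputes the gene total and the sequence proportions. The variable names and the `pct` helper are illustrative only and do not come from the paper; values given with "~" in the text are approximate, so the derived percentages are approximate as well.

```python
# Figures as quoted in the Results paragraph ("~" values are approximate).
high_conf_genes = 31_661
low_conf_genes = 19_233
total_genes = 50_894

mt40_sequence_mb = 360    # ~360 Mb of actual sequence in Mt4.0 pseudomolecules
mt40_span_mb = 390        # pseudomolecule span
optical_map_mb = 330      # ~330 Mb aligning with the optical map
mt35_sequence_mb = 250    # ~250 Mb in the BAC-based Mt3.5


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The two confidence classes should add up to the reported gene total.
gene_total_matches = high_conf_genes + low_conf_genes == total_genes

# Share of Mt4.0 sequence already present in Mt3.5 (text says ~70%).
mt35_share = pct(mt35_sequence_mb, mt40_sequence_mb)

# Share of Mt4.0 sequence that aligns with the optical map.
optical_share = pct(optical_map_mb, mt40_sequence_mb)

# Difference between span and actual sequence (unsequenced span, by assumption).
span_minus_sequence_mb = mt40_span_mb - mt40_sequence_mb

# Fate of Mt3.5 gene loci: overlapped, deprecated, absent (percent).
mt35_fates_total = 82 + 14 + 4

print(gene_total_matches, mt35_share, optical_share,
      span_minus_sequence_mb, mt35_fates_total)
```

The recomputed Mt3.5 share comes out slightly below 70%, which is consistent with the rounded "~" values used in the text.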

Conclusions

Mt4.0 and its associated resources, such as genome browsers, BLAST-able datasets and gene information pages, can be found on the JCVI Medicago web site (http://www.jcvi.org/medicago). The assembly and annotation have been deposited in GenBank (BioProject: PRJNA10791). The heavily curated chromosomal sequences and associated gene models of Medicago will serve as a better reference for legume biology and comparative genomics.

【 License 】

   
2014 Tang et al.; licensee BioMed Central Ltd.
