Journal article details
BMC Genomics
Improvement of the banana “Musa acuminata” reference sequence using NGS data and semi-automated bioinformatics methods
Methodology Article
Alex Hastie [1], Alberto Cenci [2], Mathieu Rouard [2], Guillaume Martin [3], Franc-Christophe Baurens [3], Françoise Carreel [3], Angélique D’Hont [3], Gaëtan Droc [3], Jean-Marc Aury [4], Adriana Alberti [4], Andrzej Kilian [5], Jaroslav Doležel [6]
[1] BioNano Genomics, 9640 Towne Centre Drive, 92121, San Diego, CA, USA
[2] Bioversity International, Parc Scientifique Agropolis II, 34397, Cedex 5, Montpellier, France
[3] CIRAD (Centre de coopération Internationale en Recherche Agronomique pour le Développement), UMR AGAP, TA A-108/03, Avenue Agropolis, F-34398, cedex 5, Montpellier, France
[4] Commissariat à l’Energie Atomique (CEA), Institut de Genomique (IG), Genoscope, 2 rue Gaston Cremieux, BP5706, 91057, Evry, France
[5] Diversity Arrays Technology, 2600, Yarralumla, Australian Capital Territory, Australia
[6] Institute of Experimental Botany, Centre of the Region Hana for Biotechnological and Agricultural Research, Šlechtitelů 31, CZ-78371, Olomouc, Czech Republic
Keywords: Musa acuminata; Genome assembly; Bioinformatics tool; Paired-end sequences; GBS; Genome map
DOI: 10.1186/s12864-016-2579-4
Received 2015-08-04, accepted 2016-03-08, published 2016
Source: Springer
【 Abstract 】

Background: Recent advances in genomics indicate that a majority of genome sequences, and their long-range interactions, are functionally significant. Because a detailed examination of genome organization and function requires a very high quality genome sequence, the objective of this study was to improve the reference genome assembly of banana (Musa acuminata).

Results: We have developed a modular bioinformatics pipeline to improve genome sequence assemblies, which can handle various types of data. The pipeline comprises several semi-automated tools. Unlike classical automated tools, which rely on global parameters, the semi-automated tools offer an expert mode in which the user decides on suggested improvements through local compromises. The pipeline was used to improve the draft genome sequence of Musa acuminata. Genotyping by sequencing (GBS) of a segregating population and paired-end sequencing were used to detect and correct scaffold misassemblies. Long insert size paired-end reads identified scaffold junctions and fusions missed by automated assembly methods. GBS markers were used to anchor scaffolds to pseudo-molecules with a new bioinformatics approach that avoids the tedious step of marker ordering during genetic map construction. Furthermore, a genome map was constructed and used to assemble scaffolds into super scaffolds. Finally, a consensus gene annotation was projected onto the new assembly from two pre-existing annotations. This approach reduced the total Musa scaffold number from 7513 to 1532 (i.e. by 80 %), with an N50 that increased from 1.3 Mb (65 scaffolds) to 3.0 Mb (26 scaffolds). 89.5 % of the assembly was anchored to the 11 Musa chromosomes, compared to 70 % previously. Unknown sites (N) were reduced from 17.3 to 10.0 %.

Conclusion: The release of the Musa acuminata reference genome version 2 provides a platform for detailed analysis of banana genome variation, function and evolution. The bioinformatics tools developed in this work can be used to improve genome sequence assemblies in other species.
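
The assembly statistics quoted in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions: N50 is the length of the scaffold at which the running total of scaffold lengths, sorted from longest to shortest, first reaches half of the assembly size, and the scaffold count reported next to it (presumably the "65 scaffolds" and "26 scaffolds" above) is the number of scaffolds needed to get there. The function names `n50` and `unknown_fraction` and the toy inputs are hypothetical; they are not taken from the authors' pipeline or data.

```python
def n50(lengths):
    """Return (N50, count) for a list of scaffold lengths.

    N50 is the length of the scaffold at which the cumulative length,
    taken from longest to shortest, first reaches half of the total.
    count is how many scaffolds were needed to reach that point.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count


def unknown_fraction(sequence):
    """Return the fraction of unknown sites (N) in a sequence string."""
    if not sequence:
        raise ValueError("need a non-empty sequence")
    return sequence.upper().count("N") / len(sequence)


# Toy data for illustration only, not values from the study.
toy_lengths = [10, 8, 6, 4, 2]
print(n50(toy_lengths))                # (8, 2)
print(unknown_fraction("ACGTNNNNAC"))  # 0.4
```

With real data, the lengths would come from the scaffold records of an assembly FASTA file, and the unknown-site fraction would be computed over the concatenated scaffold sequences.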

【 License 】

CC BY   
© Martin et al. 2016
