Journal article details
BMC Bioinformatics
Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb)
[1] 0000 0001 0940 9855, grid.412592.9, Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia;0000 0001 0940 9855, grid.412592.9, Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, 660074, Krasnoyarsk, Russia;0000 0001 2254 1834, grid.415877.8, Laboratory of Forest Genetics and Selection, V. N. Sukachev Institute of Forest, Siberian Branch of Russian Academy of Sciences, 660036, Krasnoyarsk, Russia;0000 0001 2364 4210, grid.7450.6, Department of Forest Genetics and Forest Tree Breeding, Georg-August University of Göttingen, 37077, Göttingen, Germany;0000 0001 2192 9124, grid.4886.2, Laboratory of Population Genetics, N. I. Vavilov Institute of General Genetics, Russian Academy of Sciences, 119333, Moscow, Russia;0000 0004 4687 2082, grid.264756.4, Department of Ecosystem Science and Management, Texas A&M University, 77843-2138, College Station, TX, USA;Department of Informatics, National Research Technical University, 664074, Irkutsk, Russia;0000 0001 2254 1834, grid.415877.8, Limnological Institute, Siberian Branch of Russian Academy of Sciences, 664033, Irkutsk, Russia
Keywords: de novo genome assembly; Siberian larch; Larix sibirica
DOI: 10.1186/s12859-018-2570-y
Source: publisher
【 Abstract 】

Background: De novo assembly of large genomes, such as those of conifers (~12–30 Gbp), which also consist of ~80% repetitive DNA, is a very complex and computationally intensive endeavor. One of the main problems in assembling such genomes lies in the computational limitations of nucleotide sequence assembly programs (DNA assemblers). Modern assemblers are usually designed for genomes no longer than the human genome (3.24 Gbp). Most assemblers cannot handle the amount of input sequence data required to provide the coverage needed for a high-quality assembly.

Results: This paper presents an original stepwise method of de novo assembly by parts (sets), which bypasses the limitations of modern assemblers associated with the huge amount of data being processed. Numerical assembly experiments conducted using the model plant Arabidopsis thaliana, Prunus persica (peach), and the four most popular assemblers (ABySS, SOAPdenovo, SPAdes, and CLC Assembly Cell) showed the validity and effectiveness of the proposed stepwise assembly method.

Conclusion: Using the new stepwise de novo assembly method presented in the paper, the genome of Siberian larch, Larix sibirica Ledeb. (12.34 Gbp), was completely assembled de novo with the CLC Assembly Cell assembler. It is the first genome assembly for a larch species, joining only five other conifer genomes sequenced and assembled: Picea abies, Picea glauca, Pinus taeda, Pinus lambertiana, and Pseudotsuga menziesii var. menziesii.

【 License 】

CC BY   
