Journal Article Details
BMC Bioinformatics
SLR-superscaffolder: a de novo scaffolding tool for synthetic long reads using a top-to-bottom scheme
Shengqiang Gu1  Lidong Guo2  Wenchao Wang3  Xin Liu4  Mengyang Xu4  Guangyi Fan4  Li Deng4  Ou Wang5  Xun Xu5  Inge Seim6  Xia Zhao7  Fang Chen7 
[1] BGI Education Center, University of Chinese Academy of Sciences, 518083, Shenzhen, China;BGI Education Center, University of Chinese Academy of Sciences, 518083, Shenzhen, China;BGI-Qingdao, BGI-Shenzhen, 266555, Qingdao, China;State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, 518083, Shenzhen, China;BGI-Qingdao, BGI-Shenzhen, 266555, Qingdao, China;BGI-Qingdao, BGI-Shenzhen, 266555, Qingdao, China;State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, 518083, Shenzhen, China;BGI-Shenzhen, 518083, Shenzhen, China;China National GeneBank, BGI-Shenzhen, 518120, Shenzhen, China;BGI-Shenzhen, 518083, Shenzhen, China;China National GeneBank, BGI-Shenzhen, 518120, Shenzhen, China;Integrative Biology Laboratory, College of Life Sciences, Nanjing Normal University, 210046, Nanjing, China;School of Biology and Environmental Science, Queensland University of Technology, 4000, Brisbane, Australia;MGI, BGI-Shenzhen, 518083, Shenzhen, China;
Keywords: Genome assembly; Synthetic long reads; Next-generation sequencing; Scaffolding
DOI: 10.1186/s12859-021-04081-z
Source: Springer
【 Abstract 】

Background: Synthetic long reads (SLR) with long-range co-barcoding information are now widely applied in genomics research. Although several tools have been developed for each specific SLR technique, a robust standalone scaffolder with high efficiency is warranted for hybrid genome assembly.

Results: In this work, we developed a standalone scaffolding tool, SLR-superscaffolder, to link together contigs in draft assemblies using co-barcoding and paired-end read information. Our top-to-bottom scheme first builds a global scaffold graph based on Jaccard Similarity to determine the order and orientation of contigs, and then locally improves the scaffolds with the aid of paired-end information. We also exploited a screening algorithm to reduce the negative effect of misassembled contigs in the input assembly. We applied SLR-superscaffolder to a human single tube long fragment read sequencing dataset and increased the scaffold NG50 of its corresponding draft assembly 1349-fold. Moreover, benchmarking on different input contigs showed that this approach overall outperformed existing SLR scaffolders, providing longer contiguity and fewer misassemblies, especially for short contigs assembled by next-generation sequencing data. The open-source code of SLR-superscaffolder is available at https://github.com/BGI-Qingdao/SLR-superscaffolder.

Conclusions: SLR-superscaffolder can dramatically improve the contiguity of a draft assembly by integrating a hybrid assembly strategy.
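
The abstract states that the global scaffold graph is built on Jaccard Similarity of co-barcoding information. As a rough illustration only, the sketch below computes the Jaccard similarity between the barcode sets of contig pairs and keeps the pairs above a threshold as candidate graph edges. It is not the SLR-superscaffolder implementation: the function names, the toy barcode data, and the `min_similarity` threshold are all assumptions made for this example, and the real tool also resolves order and orientation, which this sketch does not attempt.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; returns 0.0 if both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def candidate_edges(barcodes_by_contig: dict, min_similarity: float = 0.1) -> list:
    """Return (contig_1, contig_2, similarity) for pairs sharing enough barcodes.

    `min_similarity` is a hypothetical cutoff chosen for illustration.
    """
    edges = []
    for c1, c2 in combinations(sorted(barcodes_by_contig), 2):
        s = jaccard(barcodes_by_contig[c1], barcodes_by_contig[c2])
        if s >= min_similarity:
            edges.append((c1, c2, s))
    # Strongest co-barcoding evidence first.
    edges.sort(key=lambda e: e[2], reverse=True)
    return edges


# Toy data (invented): barcodes observed on reads mapped to each contig.
toy = {
    "contigA": {"bc1", "bc2", "bc3"},
    "contigB": {"bc2", "bc3", "bc4"},
    "contigC": {"bc9"},
}
edges = candidate_edges(toy)
print(edges)
```

In this toy input, contigA and contigB share two of four distinct barcodes, so they form the only candidate edge; contigC shares no barcodes with either and stays unlinked.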

【 License 】

CC BY   
