Journal Article Details
BMC Bioinformatics
Long-read sequencing settings for efficient structural variation detection based on comprehensive evaluation
Tao Jiang [1], Zhe Cui [1], Shuqi Cao [1], Yadong Wang [1], Shiqi Liu [1], Yadong Liu [1], Hongzhe Guo [1]
[1] Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
Keywords: Long-read sequencing; SV calling; Coverage; Read length; Sequencing error; Comprehensive evaluation
DOI: 10.1186/s12859-021-04422-y
Source: Springer
【 Abstract 】

Background: With the rapid development of long-read sequencing technologies, it is possible to reveal the full spectrum of genetic structural variation (SV). However, the expensive cost, finite read length and high sequencing error for long-read data greatly limit the widespread adoption of SV calling. Therefore, it is urgent to establish guidance concerning sequencing coverage, read length, and error rate to maintain high SV yields and to achieve the lowest cost simultaneously.

Results: In this study, we generated a full range of simulated error-prone long-read datasets containing various sequencing settings and comprehensively evaluated the performance of SV calling with state-of-the-art long-read SV detection methods. The benchmark results demonstrate that almost all SV callers perform better when the long-read data reach 20× coverage, 20 kbp average read length, and approximately 10–7.5% or below 1% error rates. Furthermore, high sequencing coverage is the most influential factor in promoting SV calling, while it also directly determines the expensive costs.

Conclusions: Based on the comprehensive evaluation results, we provide important guidelines for selecting long-read sequencing settings for efficient SV calling. We believe these recommended settings of long-read sequencing will have extraordinary guiding significance in cutting-edge genomic studies and clinical practices.

【 License 】

CC BY   
