Journal Article Details
Genome Biology
NanoCaller for accurate detection of SNPs and indels in difficult-to-map regions from long-read sequencing by haplotype-aware deep neural networks
Mian Umair Ahsan [1], Qian Liu [1], Li Fang [1], Kai Wang [2]
[1] Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, 19104, Philadelphia, PA, USA
[2] Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children’s Hospital of Philadelphia, 19104, Philadelphia, PA, USA; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, 19104, Philadelphia, PA, USA
Keywords: Variant calling; Long-range haplotype; Deep learning; Difficult-to-map regions
DOI: 10.1186/s13059-021-02472-2
Source: Springer
【Abstract】

Long-read sequencing enables variant detection in genomic regions that are considered difficult-to-map by short-read sequencing. To fully exploit the benefits of longer reads, here we present a deep learning method NanoCaller, which detects SNPs using long-range haplotype information, then phases long reads with called SNPs and calls indels with local realignment. Evaluation on 8 human genomes demonstrates that NanoCaller generally achieves better performance than competing approaches. We experimentally validate 41 novel variants in a widely used benchmarking genome, which could not be reliably detected previously. In summary, NanoCaller facilitates the discovery of novel variants in complex genomic regions from long-read sequencing.

【License】

CC BY   
