Journal article details
BMC Medical Genomics
NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer
Richard Stratford [1], Angelina Sverchkova [1], Irantzu Anzar [1], Trevor Clancy [1]
[1] grid.458653.9, OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway
Keywords: Somatic variant detection; Machine learning; Cancer genomics; Precision medicine
DOI: 10.1186/s12920-019-0508-5
Source: publisher
【 Abstract 】

Background: The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially intratumor genetic heterogeneity coupled with sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting across different algorithms, have helped to increase specificity, although this comes at the cost of sensitivity.

Methods: In light of this, we have developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples.

Results: A robust and exhaustive evaluation of NeoMutate's performance, based on 5-fold cross-validation experiments in addition to 3 independent tests, demonstrated substantially improved variant detection accuracy compared with any of its individual constituent variant callers and with consensus calling of multiple tools.

Conclusions: We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer.
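The contrast the abstract draws between consensus voting and a learned ensemble layer, evaluated by 5-fold cross-validation, can be illustrated with a minimal sketch. This is not NeoMutate's implementation: the three callers, their sensitivity and false-positive rates, and every function name below are hypothetical, and a single hand-written logistic regression stands in for the 7 supervised algorithms mentioned above.

```python
import math
import random


def simulate_sites(n, rng):
    """Simulate candidate sites: one 0/1 call per hypothetical caller, plus the truth label."""
    # (sensitivity, false-positive rate) per caller; illustrative values, not from the paper
    profiles = [(0.95, 0.30), (0.80, 0.10), (0.60, 0.02)]
    X, y = [], []
    for _ in range(n):
        truth = 1 if rng.random() < 0.5 else 0
        row = [1 if rng.random() < (sens if truth else fpr) else 0
               for sens, fpr in profiles]
        X.append(row)
        y.append(truth)
    return X, y


def consensus_predict(row, min_votes=2):
    """Consensus voting: accept a site if at least min_votes callers report it."""
    return 1 if sum(row) >= min_votes else 0


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression on caller outputs with full-batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_w = [g + err * xi for g, xi in zip(grad_w, row)]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def ensemble_predict(model, row):
    """Accept a site if the learned weighted combination of calls is positive."""
    w, b = model
    return 1 if b + sum(wi * xi for wi, xi in zip(w, row)) > 0 else 0


def k_fold_indices(n, k):
    """Split indices 0..n-1 into k interleaved, non-overlapping folds."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_validate(X, y, k=5):
    """Return (consensus accuracy, ensemble accuracy) over k held-out folds."""
    correct_consensus = correct_ensemble = 0
    for test_idx in k_fold_indices(len(X), k):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(X)) if i not in held_out]
        model = train_logistic(train_X, train_y)
        for i in test_idx:
            correct_consensus += consensus_predict(X[i]) == y[i]
            correct_ensemble += ensemble_predict(model, X[i]) == y[i]
    return correct_consensus / len(X), correct_ensemble / len(X)


if __name__ == "__main__":
    X, y = simulate_sites(500, random.Random(0))
    consensus_acc, ensemble_acc = cross_validate(X, y)
    print(f"consensus accuracy: {consensus_acc:.3f}")
    print(f"ensemble accuracy:  {ensemble_acc:.3f}")
```

The learned layer can weight a highly specific caller more heavily than a noisy one, whereas a fixed vote threshold treats all callers alike. A realistic setup would also feed biological and sequence features into the model, as the abstract describes, rather than caller votes alone.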

【 License 】

CC BY   
