Journal Article Details
Journal of computational biology: A journal of computational molecular cell biology
Continuous Embeddings of DNA Sequencing Reads and Application to Metagenomics
Jean-Philippe Vert^3,1,2,4,54  Romain Menegaux^1,25
[1] Address correspondence to: Dr. Jean-Philippe Vert, Google Brain, 8 rue de Londres, 75009 Paris, France^3; Ecole Normale Supérieure, Department of Mathematics and Applications, CNRS, PSL Research University, Paris, France^4; Google Brain, Paris, France^5; Institut Curie, PSL Research University, INSERM, U900, Paris, France^2; MINES ParisTech, PSL Research University, CBIO - Centre for Computational Biology, Paris, France^1
Keywords: metagenomics; sequencing; classification; embedding
DOI: 10.1089/cmb.2018.0174
Subject classification: Biological Sciences (General)
Source: Mary Ann Liebert, Inc. Publishers
【 Abstract 】

We propose a new model for fast classification of DNA sequences output by next-generation sequencing machines. The model, which we call fastDNA, embeds DNA sequences in a vector space by learning continuous low-dimensional representations of the k-mers they contain. We show on metagenomics benchmarks that it outperforms state-of-the-art methods in terms of accuracy and scalability.
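
The idea of embedding a read through its k-mers can be illustrated with a minimal sketch. It is not the fastDNA implementation: the k-mer length `K`, the dimension `DIM`, the function names, and the choice to average k-mer vectors before a linear classifier are all assumptions made for illustration, and the embedding table is randomly initialized rather than learned.

```python
import random
from itertools import product

K = 3    # k-mer length (hypothetical; kept small for illustration)
DIM = 4  # embedding dimension (hypothetical)

# Map every possible k-mer over the DNA alphabet to a row index.
KMER_INDEX = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=K))}

def init_embeddings(seed=0):
    """Random k-mer embedding table; a trained model would learn these values."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(DIM)] for _ in KMER_INDEX]

def kmers(read):
    """Return the overlapping k-mers of a read, skipping any with non-ACGT symbols."""
    candidates = (read[i:i + K] for i in range(len(read) - K + 1))
    return [km for km in candidates if km in KMER_INDEX]

def embed_read(read, table):
    """Embed a read as the mean of its k-mer vectors (assumed aggregation)."""
    ids = [KMER_INDEX[km] for km in kmers(read)]
    if not ids:
        return [0.0] * DIM  # no valid k-mer: fall back to the zero vector
    return [sum(table[i][d] for i in ids) / len(ids) for d in range(DIM)]

def classify(vec, weights):
    """Return the index of the class with the highest linear score."""
    scores = [sum(w * x for w, x in zip(row, vec)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)
```

In this sketch a read such as `"ACGTA"` is split into `ACG`, `CGT`, and `GTA`, each k-mer is looked up in the table, and the averaged vector is scored against one weight row per class. Training the table and the weights jointly is what a real model would add.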

【 License 】

Unknown   
