Journal Article Details
Journal of Computational Biology: A Journal of Computational Molecular Cell Biology
Continuous Embeddings of DNA Sequencing Reads and Application to Metagenomics
Jean-Philippe Vert^3,1,2,4,54; Romain Menegaux^1,25
[1] Address correspondence to: Dr. Jean-Philippe Vert, Google Brain, 8 rue de Londres, 75009 Paris, France^3; Ecole Normale Supérieure, Department of Mathematics and Applications, CNRS, PSL Research University, Paris, France^4; Google Brain, Paris, France^5; Institut Curie, PSL Research University, INSERM, U900, Paris, France^2; MINES ParisTech, PSL Research University, CBIO - Centre for Computational Biology, Paris, France^1
Keywords: metagenomics; sequencing; classification; embedding
DOI: 10.1089/cmb.2018.0174
Subject classification: Biological sciences (general)
Source: Mary Ann Liebert, Inc. Publishers
【 Abstract 】
We propose a new model for fast classification of DNA sequences output by next-generation sequencing machines. The model, which we call fastDNA, embeds DNA sequences in a vector space by learning continuous low-dimensional representations of the k-mers they contain. We show on metagenomics benchmarks that it outperforms state-of-the-art methods in terms of accuracy and scalability.
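The idea of embedding a read through its k-mers can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how k-mer vectors are combined, so averaging them is an assumption here, the embedding table is random rather than learned, and all names (`kmer_index`, `make_embeddings`, `embed_read`, `classify`) are hypothetical.

```python
import random

ALPHABET = "ACGT"


def kmer_index(kmer):
    """Map a k-mer over ACGT to an integer in [0, 4**k).

    Returns None if the k-mer contains any other symbol (e.g. N).
    """
    idx = 0
    for base in kmer:
        pos = ALPHABET.find(base)
        if pos < 0:
            return None
        idx = idx * 4 + pos
    return idx


def kmers(read, k):
    """List all overlapping k-mers of a read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def make_embeddings(k, dim, seed=0):
    """Random stand-in for a learned table: one dim-vector per possible k-mer."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(4 ** k)]


def embed_read(read, table, k):
    """Embed a read as the mean of its k-mer vectors (assumed pooling rule)."""
    dim = len(table[0])
    total = [0.0] * dim
    count = 0
    for kmer in kmers(read, k):
        idx = kmer_index(kmer)
        if idx is None:
            continue  # skip k-mers containing ambiguous bases
        for j, value in enumerate(table[idx]):
            total[j] += value
        count += 1
    if count == 0:
        return total  # no valid k-mer: return the zero vector
    return [t / count for t in total]


def classify(embedding, weights):
    """Return the index of the highest-scoring class under a linear model."""
    scores = [sum(w * x for w, x in zip(row, embedding)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


K, DIM = 3, 4
table = make_embeddings(K, DIM)
read_vec = embed_read("ACGTNACGT", table, K)
```

In a trained model, the table and the classifier weights would be fitted jointly on labeled reads. Here they only show the data flow from read to k-mers to vector to predicted class.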
【 License 】
Unknown