| BMC Bioinformatics | |
| DiScRIBinATE: a rapid method for accurate taxonomic classification of metagenomic sequences | |
| Proceedings | |
| Tarini Shankar Ghosh [1], Sharmila S Mande [1], Monzoorul Haque M [1] | |
| [1] Bio-Sciences Division, Innovation Labs, Tata Consultancy Services, 1 Software Units Layout, Hyderabad 500 081, Andhra Pradesh, India | |
| Keywords: Misclassification Rate; Correct Assignment; High Taxonomic Level; Alignment Quality; Database Variant | |
| DOI: 10.1186/1471-2105-11-S7-S14 | |
| Source: Springer | |
【 Abstract 】
Background: In metagenomic sequence data, the majority of sequences/reads originate from new or partially characterized genomes whose sequences are absent from existing reference databases. Since taxonomic assignment of reads is based on their similarity to sequences from known organisms, the presence of reads originating from new organisms poses a major challenge to taxonomic binning methods. The recently published SOrt-ITEMS algorithm uses an elaborate workflow to assign reads originating from hitherto unknown genomes with significant accuracy and specificity. Nevertheless, a significant proportion of reads is still misclassified. In addition, the alignment-based orthology step that SOrt-ITEMS uses to improve the specificity of assignments increases its total binning time.
Results: In this paper, we introduce a rapid binning approach called DiScRIBinATE (Distance Score Ratio for Improved Binning And Taxonomic Estimation). DiScRIBinATE replaces the orthology approach of SOrt-ITEMS with a quicker 'alignment-free' approach. We demonstrate that incorporating this approach halves the binning time without any loss in the specificity and accuracy of assignments. In addition, a novel reclassification strategy incorporated in DiScRIBinATE reduces the overall misclassification rate to around 3-7%. This misclassification rate is 1.5 to 3 times lower than that of SOrt-ITEMS, and 3 to 30 times lower than that of MEGAN.
Conclusions: A significant reduction in binning time, coupled with assignment accuracy superior to that of existing binning methods, indicates that the proposed algorithm is well suited to rapidly mapping the taxonomic diversity of large metagenomic samples with high accuracy and specificity.
Availability: The program is available on request from the authors.
【 License 】
Unknown
© Ghosh et al; licensee BioMed Central Ltd. 2010. This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.