Journal Article Details
BMC Bioinformatics
DectICO: an alignment-free supervised metagenomic classification method based on feature extraction and dynamic selection
Methodology Article
Changchang Cao [1], Fudong Cheng [1], Xiao Sun [1], Xiao Ding [1]
[1] State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, 210096, Nanjing, China;
Keywords: Alignment-free; Metagenome; Classification; Sequence feature; Feature selection
DOI: 10.1186/s12859-015-0753-3
Received 2015-04-16, accepted 2015-09-28, published 2015
Source: Springer
【 Abstract 】

Background: Continual progress in next-generation sequencing makes it possible to generate increasingly large metagenomes sampled over time or space. Comparing and classifying metagenomes from different microbial communities is critical. Alignment-free supervised classification is important for discriminating between the multifarious components of metagenomic samples, because it can be accomplished independently of known microbial genomes.

Results: We propose an alignment-free supervised metagenomic classification method called DectICO. The intrinsic correlation of oligonucleotides (ICO) provides the feature set, which is selected dynamically using a kernel partial least squares algorithm, and the feature matrices extracted with this set are sequentially employed to train support vector machine (SVM) classifiers. We evaluated the classification performance of DectICO on three real metagenomic sequencing datasets, two containing deep-sequencing metagenomes and one of low coverage. Validation results show that DectICO is powerful, performs well with long oligonucleotides (i.e., 6-mers to 8-mers), and is more stable and generalizes better than a sequence-composition-based method. The classifiers trained by our method are more accurate than those obtained with non-dynamic feature selection methods and with a recently published recursive-SVM-based classification approach.

Conclusions: The alignment-free supervised classification method DectICO can accurately classify metagenomic samples without depending on known microbial genomes. Selecting the ICO dynamically offers better stability and generality than sequence-composition-based classification algorithms. Our proposed method provides new insights into metagenomic sample classification.
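
As a rough illustration of what "alignment-free" feature extraction means, the sketch below turns a sample's reads into a normalized k-mer frequency vector without consulting any reference genome. This is the plain sequence-composition representation that the abstract uses as a point of comparison; it is not the ICO feature, and the kernel partial least squares selection and SVM training steps are not reproduced here. The function name `kmer_frequencies` and the toy reads are assumptions made for this example only.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(reads, k):
    """Return (kmers, freqs): a normalized k-mer frequency vector pooled over reads.

    Hypothetical helper for illustration; windows containing characters
    other than A, C, G, T (for example N) are skipped.
    """
    alphabet = "ACGT"
    # Fixed, lexicographically ordered list of all 4**k possible k-mers.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            window = read[i:i + k]
            if set(window) <= set(alphabet):
                counts[window] += 1
    total = sum(counts.values())
    if total == 0:
        return kmers, [0.0] * len(kmers)
    return kmers, [counts[m] / total for m in kmers]


# Toy "samples": each sample is a list of reads (made-up data, not from the paper).
sample_a = ["ACGTACGTAC", "GTACGTACGT"]
sample_b = ["AAAAATTTTT", "TTTTTAAAAN"]

kmers, vec_a = kmer_frequencies(sample_a, k=2)
_, vec_b = kmer_frequencies(sample_b, k=2)
```

Each sample becomes one row of a feature matrix with 4^k columns, so the 6-mers to 8-mers mentioned in the abstract would give 4,096 to 65,536 columns per sample. A supervised classifier such as an SVM could then be trained on those rows and their labels.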

【 License 】

CC BY   
© Ding et al. 2015
