Journal Article Details
BioData Mining
Comparison of 16S and whole genome dog microbiomes using machine learning
Qinghong Li [1], Andrea Nash [1], Scott Lewis [2], Tae-Hyuk Ahn [3]
[1] Nestlé Purina Research, St. Louis, MO, USA; Program in Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO, USA; Program in Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO, USA; Department of Computer Science, Saint Louis University, St. Louis, MO, USA
Keywords: Metagenomics; Microbiome; Machine learning; Classification; Supervised learning; Diabetes; Dog; Diet; Whole genome shotgun; 16S amplicon
DOI: 10.1186/s13040-021-00270-x
Source: Springer
【 Abstract 】

Background: Recent advances in sequencing technologies have driven studies identifying the microbiome as a key regulator of overall health and disease in the host. Both 16S amplicon and whole genome shotgun sequencing technologies are currently used to investigate this relationship; however, the choice of sequencing technology often depends on the nature and experimental design of the study. In principle, the outputs of analysis pipelines are heavily influenced by their input data, so it is important to consider that the genomic features produced by different sequencing technologies may emphasize different results.

Results: In this work, we use public 16S amplicon and whole genome shotgun sequencing (WGS) data from the same dogs to investigate the relationship between sequencing technology and the captured gut metagenomic landscape. We compare the taxonomic resolution at the species and phylum levels and benchmark 12 classification algorithms on their ability to accurately identify host phenotype using only taxonomic relative abundance information from 16S and WGS datasets with identical study designs. Our best performing model, a random forest trained on the WGS dataset, identified a species (Bacteroides coprocola) that predominantly contributes to the abundance of leuB, a gene involved in branched-chain amino acid biosynthesis, a risk factor for glucose intolerance, insulin resistance, and type 2 diabetes. This trend was not conserved when we trained the model on 16S sequencing profiles from the same dogs.

Conclusions: Our results indicate that WGS sequencing of dog microbiomes detects greater taxonomic diversity than 16S sequencing of the same dogs, both at the species level and for four gut-enriched phyla. After down-sampling, this difference in detection does not significantly affect the performance metrics of the machine learning algorithms. Although the important features extracted from our best performing model are not conserved between the two technologies, the features extracted in either case indicate the utility of machine learning algorithms in identifying biologically meaningful relationships between the host and members of the microbiome community. In conclusion, this work provides the first systematic machine learning comparison of dog 16S and WGS microbiomes derived from identical study designs.
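
The abstract refers to taxonomic relative abundance profiles and to down-sampling, but does not say how either was computed. As a rough illustration only, the Python sketch below assumes that down-sampling means drawing reads without replacement to a common depth, and that relative abundance is each taxon's share of a sample's reads. The function names, the counts, and the labels "Taxon B" and "Taxon C" are hypothetical and do not come from the study.

```python
import random
from collections import Counter


def relative_abundance(counts):
    """Convert raw per-taxon read counts into fractions that sum to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {taxon: n / total for taxon, n in counts.items()}


def downsample(counts, depth, seed=0):
    """Draw `depth` reads without replacement from a sample's read pool.

    Assumption: read-level subsampling; the study may have used another scheme.
    """
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand the counts into one label per read, then subsample the labels.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


# Hypothetical counts for one dog profiled with both technologies.
wgs_counts = {"Bacteroides coprocola": 600, "Taxon B": 300, "Taxon C": 100}
amplicon_counts = {"Bacteroides coprocola": 60, "Taxon B": 40}

# Bring both profiles to the shallower depth before comparing them.
depth = min(sum(wgs_counts.values()), sum(amplicon_counts.values()))
wgs_profile = relative_abundance(downsample(wgs_counts, depth))
amplicon_profile = relative_abundance(downsample(amplicon_counts, depth))
```

Profiles normalized this way could then serve as feature vectors for any of the benchmarked classifiers. Fixing the seed keeps the subsample reproducible across runs.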

【 License 】

CC BY   
