Genome Medicine
Systematic comparison of published host gene expression signatures for bacterial/viral discrimination
Geoffrey S. Ginsburg, Micah T. McClain, Ephraim L. Tsalik, Christopher W. Woods, Ricardo Henao, Emily R. Ko, Melissa Ross, Nicholas Bodkin
Affiliations: Duke Center for Applied Genomics and Precision Medicine, Duke University School of Medicine, Durham, NC, USA; All of Us Research Program, National Institutes of Health, Bethesda, MD, USA; Division of Infectious Diseases, Department of Medicine, Duke University School of Medicine, Durham, NC, USA; Durham VA Health Care System, Durham, NC, USA; Duke University Department of Biostatistics and Informatics, Durham, NC, USA; Durham Regional Hospital, Durham, NC, USA; Duke University School of Medicine, Durham, NC, USA; Duke University Trinity College of Arts and Sciences, Durham, NC, USA
Keywords: Biomarkers; Infectious disease; Diagnostics; Gene expression; Machine learning
DOI: 10.1186/s13073-022-01025-x
Source: Springer
【 Abstract 】
Background: Measuring host gene expression is a promising diagnostic strategy to discriminate bacterial and viral infections. Multiple signatures of varying size, complexity, and target populations have been described. However, there is little information on how the performance of the various published signatures compares.

Methods: This systematic comparison of host gene expression signatures evaluated the performance of 28 signatures, validating them in 4589 subjects from 51 publicly available datasets. Thirteen COVID-19-specific datasets with 1416 subjects were included in a separate analysis. Individual signature performance was evaluated using the area under the receiver operating characteristic curve (AUC). Overall signature performance was evaluated using median AUCs and accuracies.

Results: Signature performance varied widely, with median AUCs ranging from 0.55 to 0.96 for bacterial classification and from 0.69 to 0.97 for viral classification. Signature size varied (1–398 genes), with smaller signatures generally performing more poorly (P < 0.04). Viral infection was easier to diagnose than bacterial infection (84% vs. 79% overall accuracy, respectively; P < 0.001). Host gene expression classifiers performed more poorly in some pediatric populations (3 months–1 year and 2–11 years) than in the adult population for both bacterial infection (73% and 70% vs. 82%, respectively; P < 0.001) and viral infection (80% and 79% vs. 88%, respectively; P < 0.001). We did not observe classification differences based on illness severity, as defined by ICU admission, for bacterial or viral infections. The median AUC across all signatures for COVID-19 classification was 0.80, compared to 0.83 for viral classification in the same datasets.

Conclusions: In this systematic comparison of 28 host gene expression signatures, we observed differences based on a signature's size and the characteristics of the validation population, including age and infection type. However, the populations used for signature discovery did not impact performance, underscoring the redundancy among many of these signatures. Furthermore, differential performance in specific populations may only be observable through this type of large-scale validation.
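As an illustration of the evaluation scheme described in the Methods (one AUC per signature per validation dataset, summarized by the median across datasets), the sketch below shows how such a summary could be computed. This is not the authors' code; the column names ("dataset", "signature_score", "label") are hypothetical placeholders for a table of per-subject signature scores and infection labels.

```python
# Minimal sketch, assuming a per-subject table with hypothetical columns:
#   dataset         - identifier of the public validation dataset
#   signature_score - the signature's continuous score for each subject
#   label           - 1 if the target class (e.g., bacterial infection), else 0
import pandas as pd
from sklearn.metrics import roc_auc_score

def summarize_signature(df: pd.DataFrame) -> pd.Series:
    """Compute one AUC per validation dataset, then summarize with the median."""
    per_dataset_auc = df.groupby("dataset").apply(
        lambda g: roc_auc_score(g["label"], g["signature_score"])
    )
    return pd.Series({
        "median_auc": per_dataset_auc.median(),
        "n_datasets": per_dataset_auc.size,
    })
```

Repeating this per signature and per target class (bacterial, viral, or COVID-19) would yield the kind of median-AUC comparisons reported in the Results.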
【 License 】
CC BY