Journal article

【Abstract】

Background: The COVID-19 pandemic has led to an unprecedented volume of scientific publications, growing at a pace never seen before. Multiple living systematic reviews have been developed to provide professionals with up-to-date and trustworthy health information, but it is increasingly challenging for systematic reviewers to keep up with the evidence in electronic databases. We aimed to investigate deep learning-based machine learning algorithms for classifying COVID-19-related publications, to help scale up the epidemiological curation process.

Methods: In this retrospective study, five different pre-trained deep learning-based language models were fine-tuned on a dataset of 6365 publications manually classified into two classes, three subclasses, and 22 sub-subclasses relevant for epidemiological triage purposes. In a k-fold cross-validation setting, each standalone model was assessed on a classification task and compared against an ensemble, which takes the standalone model predictions as input and uses different strategies to infer the optimal article class. A ranking task was also considered, in which the model outputs a ranked list of sub-subclasses associated with the article.

Results: The ensemble model significantly outperformed the standalone classifiers, achieving an F1-score of 89.2% at the class level of the classification task. The gap between the standalone and ensemble models widens at the sub-subclass level, where the ensemble reaches a micro F1-score of 70% against 67% for the best-performing standalone model. For the ranking task, the ensemble obtained the highest recall@3, at 89%. Using a unanimity voting rule, the ensemble can provide predictions with higher confidence on a subset of the data, detecting original papers with an F1-score of up to 97% on a subset covering 80% of the collection, compared with 93% on the whole dataset.

Conclusion: This study shows the potential of deep learning language models to triage COVID-19 references efficiently and to support epidemiological curation and review. The ensemble consistently and significantly outperforms any standalone model. Tuning the thresholds of the voting strategy is an interesting option for annotating a subset with higher predictive confidence.
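The abstract does not specify how the ensemble combines its inputs, so the following is only a minimal sketch of two ideas it mentions: a unanimity voting rule that abstains when the standalone models disagree, and the recall@k metric used for the ranking task. The function names, the label strings `original` and `other`, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label chosen by the most models (ties go to the first label seen)."""
    return Counter(predictions).most_common(1)[0][0]


def unanimity_vote(predictions: Sequence[str]) -> Optional[str]:
    """Return the shared label only if every model agrees; otherwise abstain (None)."""
    return predictions[0] if len(set(predictions)) == 1 else None


def recall_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int = 3) -> float:
    """Fraction of articles whose gold label appears among the top-k ranked labels."""
    hits = sum(1 for labels, g in zip(ranked, gold) if g in labels[:k])
    return hits / len(gold)


# Hypothetical predictions from five standalone models for three articles.
votes = [
    ["original"] * 5,
    ["original", "original", "original", "other", "other"],
    ["other"] * 5,
]

majority = [majority_vote(v) for v in votes]
unanimous = [unanimity_vote(v) for v in votes]

# Coverage: share of articles on which the unanimity rule commits to a label.
coverage = sum(label is not None for label in unanimous) / len(unanimous)

# Hypothetical ranked label lists for two articles whose gold label is "c".
example_recall = recall_at_k([["a", "b", "c", "d"], ["d", "a", "b", "c"]], ["c", "c"], k=3)
```

Under a rule of this kind, the articles on which the models disagree are left unlabeled, which is the trade-off the abstract describes: higher confidence on the labeled subset in exchange for lower coverage of the collection.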

【License】

CC BY
© The Author(s) 2023



Systematic Reviews
Ensemble of deep learning language models to support the creation of living systematic reviews for the COVID-19 literature
Research
Aziz Mert Ipekci¹ Nicola Low¹ Diana Buitrago-Garcia¹ Leonie Heron¹ Hira Imeri¹ Michel Counotte² Quentin Haas³ Poorya Amini⁴ Julien Knafou⁵ Nikolay Borissov⁶ Douglas Teodoro⁷
[1] Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland; Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland; Wageningen Bioveterinary Research, Wageningen University & Research, Wageningen, The Netherlands; Risklick AG, Bern, Switzerland; Risklick AG, Bern, Switzerland; CTU Bern, University of Bern, Bern, Switzerland; University of Applied Sciences and Arts of Western Switzerland (HES-SO), Rue de la Tambourine 17, 1227, Geneva, Switzerland; University of Applied Sciences and Arts of Western Switzerland (HES-SO), Rue de la Tambourine 17, 1227, Geneva, Switzerland; CTU Bern, University of Bern, Bern, Switzerland; University of Applied Sciences and Arts of Western Switzerland (HES-SO), Rue de la Tambourine 17, 1227, Geneva, Switzerland; Department of Radiology and Medical Informatics, University of Geneva, Geneva, Switzerland
Keywords: COVID-19; Living systematic review; Literature screening; Text classification; Language model; Deep learning; Transfer learning
DOI: 10.1186/s13643-023-02247-9
Received 2022-07-25, accepted 2023-04-24, published 2023
Source: Springer