Journal Article Details
Systematic Reviews
Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records
Research
Gerbrich Ferdinands [1], Jelle Jasper Teijema [1], Ayoub Bagheri [1], Daniel L. Oberski [1], Rens van de Schoot [1], Raoul Schram [2], Jonathan de Bruin [2], Lars Tummers [3]
[1] Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands
[2] Department of Research and Data Management Services, Information Technology Services, Utrecht University, Utrecht, The Netherlands
[3] School of Governance, Faculty of Law, Economics and Governance, Utrecht University, Utrecht, The Netherlands
Keywords: Systematic reviews; Active learning; Screening prioritization; Machine learning; Data mining; Computer simulation
DOI: 10.1186/s13643-023-02257-7
Received 2020-08-20, accepted 2023-05-16, published 2023
Source: Springer
【 Abstract 】

Background: Conducting a systematic review demands a significant amount of effort in screening titles and abstracts. To accelerate this process, various tools that utilize active learning have been proposed. These tools allow the reviewer to interact with machine learning software to identify relevant publications as early as possible. The goal of this study is to gain a comprehensive understanding of active learning models for reducing the workload in systematic reviews through a simulation study.

Methods: The simulation study mimics the process of a human reviewer screening records while interacting with an active learning model. Different active learning models were compared based on four classification techniques (naive Bayes, logistic regression, support vector machines, and random forest) and two feature extraction strategies (TF-IDF and doc2vec). The performance of the models was compared for six systematic review datasets from different research areas. The evaluation of the models was based on the Work Saved over Sampling (WSS) and recall. Additionally, this study introduces two new statistics, Time to Discovery (TD) and Average Time to Discovery (ATD).

Results: The models reduce the number of publications that need to be screened by 63.9% to 91.7% while still finding 95% of all relevant records (WSS@95). Recall, defined as the proportion of relevant records found after screening 10% of all records, ranges from 53.6% to 99.8%. The ATD values range from 1.4% to 11.7% and indicate the average proportion of labeling decisions the researcher needs to make to detect a relevant record. The ATD values display a similar ranking across the simulations as the recall and WSS values.

Conclusions: Active learning models for screening prioritization demonstrate significant potential for reducing the workload in systematic reviews. The Naive Bayes + TF-IDF model yielded the best results overall.
The Average Time to Discovery (ATD) measures performance of active learning models throughout the entire screening process without the need for an arbitrary cut-off point. This makes the ATD a promising metric for comparing the performance of different models across different datasets.
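
The three metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes a single simulation run represented as a list of 0/1 labels in the order the records were screened (1 = relevant). The function names, the toy data, and the rounding choices are illustrative assumptions, not the authors' implementation, and the sketch ignores details such as prior-knowledge records or averaging over repeated runs.

```python
import math

def recall_at(labels, fraction=0.10):
    """Share of all relevant records found after screening `fraction` of the records."""
    total_relevant = sum(labels)
    if total_relevant == 0:
        raise ValueError("no relevant records in this run")
    n_screened = math.floor(fraction * len(labels))
    return sum(labels[:n_screened]) / total_relevant

def wss_at(labels, recall_level=0.95):
    """Work Saved over Sampling: records left unscreened when `recall_level`
    is reached, as a share of all records, minus the (1 - recall_level)
    that random screening would save at the same recall."""
    total_relevant = sum(labels)
    if total_relevant == 0:
        raise ValueError("no relevant records in this run")
    target = math.ceil(recall_level * total_relevant)
    n = len(labels)
    found = 0
    for position, label in enumerate(labels, start=1):
        found += label
        if found >= target:
            return (n - position) / n - (1 - recall_level)

def time_to_discovery(labels):
    """Screening position (1-based) at which each relevant record was found."""
    return [pos for pos, label in enumerate(labels, start=1) if label == 1]

def average_time_to_discovery(labels):
    """Mean TD over relevant records, expressed as a share of all records."""
    td = time_to_discovery(labels)
    if not td:
        raise ValueError("no relevant records in this run")
    return sum(td) / len(td) / len(labels)

# Hypothetical screening order: 20 records, 4 relevant (positions 1, 2, 4, 9).
toy_labels = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0] + [0] * 10
print(recall_at(toy_labels), wss_at(toy_labels), average_time_to_discovery(toy_labels))
```

Unlike `recall_at` and `wss_at`, the ATD function takes no cut-off argument, which reflects the point made above: it summarizes the whole screening order rather than a single threshold.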

【 License 】

CC BY   
© The Author(s) 2023
