| Health Technology Assessment | |
| An observational study to assess if automated diabetic retinopathy image assessment software can replace one or more steps of manual imaging grading and to determine their cost-effectiveness | |
| Clare Bailey1  Paul Taylor2  Caroline Rudisill3  Sebastian Salas-Vega3  SriniVas Sadda4  Louis Bolter5  John Anderson5  Adnan Tufail6  Vern Louw6  Gerald Liew6  Aaron Lee6  Catherine Egan6  Alicja R Rudnicka7  Venediktos V Kapetanakis7  Christopher G Owen7  | |
| [1] Bristol Eye Hospital, Bristol, UK; [2] Centre for Health Informatics & Multiprofessional Education (CHIME), Institute of Health Informatics, University College London, London, UK; [3] Department of Social Policy, LSE Health, London School of Economics and Political Science, London, UK; [4] Doheny Eye Institute, Los Angeles, CA, USA; [5] Homerton University Hospital Foundation Trust, London, UK; [6] National Institute for Health Research Moorfields Biomedical Research Centre, Moorfields Eye Hospital, London, UK; [7] Population Health Research Institute, St George’s, University of London, London, UK | |
| Keywords: diabetes mellitus; diabetic retinopathy; digital image; screening; validation; automatic classification; sensitivity; specificity; detection; health economics; cost-effectiveness; machine learning | |
| DOI : 10.3310/hta20920 | |
| Source: DOAJ | |
【 Abstract 】
Background: Diabetic retinopathy screening in England involves labour-intensive manual grading of retinal images. Automated retinal image analysis systems (ARIASs) may offer an alternative to manual grading.

Objectives: To determine the screening performance and cost-effectiveness of ARIASs when used either to replace level 1 human graders or as a pre-screen prior to level 1 human grading in the NHS diabetic eye screening programme (DESP). To examine technical issues associated with implementation.

Design: Observational retrospective measurement comparison study with a real-time evaluation of technical issues and a decision-analytic model to evaluate cost-effectiveness.

Setting: An NHS DESP.

Participants: Consecutive diabetic patients who attended a routine annual NHS DESP visit.

Interventions: Retinal images were manually graded and processed by three ARIASs: iGradingM (version 1.1; originally Medalytix Group Ltd, Manchester, UK, but purchased by Digital Healthcare, Cambridge, UK, at the initiation of the study, purchased in turn by EMIS Health, Leeds, UK, after conclusion of the study), Retmarker (version 0.8.2, Retmarker Ltd, Coimbra, Portugal) and EyeArt (Eyenuk Inc., Woodland Hills, CA, USA). The final manual grade was used as the reference standard. A reading centre masked to all grading arbitrated a subset of the discrepancies between manual grading and ARIAS output, producing a reference standard of manual grade modified by arbitration.

Main outcome measures: Screening performance (sensitivity, specificity, false-positive rate and likelihood ratios) and diagnostic accuracy [95% confidence intervals (CIs)] of ARIASs. A secondary analysis explored the influence of camera type and patients’ ethnicity, age and sex on screening performance. Economic analysis estimated the cost per appropriate screening outcome identified.

Results: A total of 20,258 patients with 102,856 images were entered into the study. The sensitivity point estimates of the ARIASs were as follows: EyeArt 94.7% (95% CI 94.2% to 95.2%) for any retinopathy, 93.8% (95% CI 92.9% to 94.6%) for referable retinopathy and 99.6% (95% CI 97.0% to 99.9%) for proliferative retinopathy; and Retmarker 73.0% (95% CI 72.0% to 74.0%) for any retinopathy, 85.0% (95% CI 83.6% to 86.2%) for referable retinopathy and 97.9% (95% CI 94.9% to 99.1%) for proliferative retinopathy. iGradingM classified all images as either ‘disease’ or ‘ungradable’, limiting further iGradingM analysis. The sensitivity and false-positive rates for EyeArt were not affected by ethnicity, sex or camera type, but sensitivity declined marginally with increasing patient age. The screening performance of Retmarker appeared to vary with patient age, ethnicity and camera type. Both EyeArt and Retmarker were cost saving relative to manual grading, either as a replacement for level 1 human grading or used prior to level 1 human grading, although the latter was less cost-effective. In a threshold analysis, the highest ARIAS cost per patient before the ARIAS became more expensive per appropriate outcome than human grading, when used to replace the level 1 grader, was £3.82 for Retmarker and £2.71 for EyeArt.

Limitations: The non-randomised study design limited the health economic analysis, but the same retinal images were processed by all ARIASs in this measurement comparison study.

Conclusions: Retmarker and EyeArt achieved acceptable sensitivity for referable retinopathy and acceptable false-positive rates (compared with human graders as the reference standard) and appear to be cost-effective alternatives to a purely manual grading approach. Future work is required to develop technical specifications to optimise deployment and address potential governance issues.

Funding: The National Institute for Health Research (NIHR) Health Technology Assessment programme, a Fight for Sight Grant (Hirsch grant award) and the Department of Health’s NIHR Biomedical Research Centre for Ophthalmology at Moorfields Eye Hospital and the University College London Institute of Ophthalmology.
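
For readers unfamiliar with the outcome measures named in the abstract, the following is a minimal sketch of how sensitivity, specificity, false-positive rate and likelihood ratios follow from a 2x2 table of ARIAS output against the reference standard. The counts are hypothetical and are not study data. The function names are illustrative, and the Wilson score interval is an assumption, since the abstract does not state which method was used to compute the 95% CIs.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: chosen for illustration only; the abstract does not
    say which confidence interval method the study used.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def screening_performance(tp, fn, fp, tn):
    """Screening measures from a 2x2 table against a reference standard.

    tp/fn: reference-positive cases flagged / missed by the software.
    fp/tn: reference-negative cases flagged / passed by the software.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    return {
        "sensitivity": sensitivity,
        "sensitivity_ci": wilson_interval(tp, tp + fn),
        "specificity": specificity,
        "false_positive_rate": false_positive_rate,
        # Positive likelihood ratio: sensitivity / (1 - specificity)
        "lr_positive": (sensitivity / false_positive_rate
                        if false_positive_rate > 0 else math.inf),
        # Negative likelihood ratio: (1 - sensitivity) / specificity
        "lr_negative": ((1 - sensitivity) / specificity
                        if specificity > 0 else math.inf),
    }


# Hypothetical counts, NOT taken from the study.
result = screening_performance(tp=90, fn=10, fp=40, tn=160)
for name, value in result.items():
    print(name, value)
```

With these made-up counts the sensitivity is 90/100 and the false-positive rate is 40/200; the study's own estimates are those reported in the Results above.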
【 License 】
Unknown