Journal Article Details
BMC Medical Informatics and Decision Making
A validated natural language processing algorithm for brain imaging phenotypes from radiology reports in UK electronic health records
Affiliations: Centre for Clinical Brain Sciences, University of Edinburgh, Edinburgh, UK (0000 0004 1936 7988, grid.4305.2); Centre for Medical Informatics, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK (0000 0004 1936 7988, grid.4305.2); Health Data Research UK Scotland, Edinburgh, UK; Nuffield Department of Population Health, University of Oxford, Oxford, UK (0000 0004 1936 8948, grid.4991.5); The Alan Turing Institute, British Library, 96 Euston Road, London, UK; Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, Edinburgh, UK (0000 0004 1936 7988, grid.4305.2)
Keywords: Radiology; Natural language processing; Brain imaging; Phenotyping; Radiology reports; Stroke
DOI: 10.1186/s12911-019-0908-7
Source: publisher
【 Abstract 】

Background: Manual coding of phenotypes in brain radiology reports is time-consuming. We developed a natural language processing (NLP) algorithm to enable automatic identification of brain imaging phenotypes in radiology reports from routine clinical practice in the UK National Health Service (NHS).

Methods: We used anonymized text reports of brain imaging from a cohort study of stroke/TIA patients and from a regional hospital to develop and test an NLP algorithm. Two experts marked up text in 1692 reports for 24 cerebrovascular and other neurological phenotypes. We developed and tested a rule-based NLP algorithm first within the cohort study, and then evaluated it further in the reports from the regional hospital.

Results: Agreement between expert readers was excellent (Cohen's κ = 0.93) in both datasets. In the final test dataset (n = 700) of unseen regional hospital reports, the algorithm performed very well for a report of any ischaemic stroke [sensitivity 89% (95% CI: 81–94); positive predictive value (PPV) 85% (95% CI: 76–90); specificity 100% (95% CI: 99–100)]; any haemorrhagic stroke [sensitivity 96% (95% CI: 80–99); PPV 72% (95% CI: 55–84); specificity 100% (95% CI: 99–100)]; brain tumours [sensitivity 96% (95% CI: 87–99); PPV 84% (95% CI: 73–91); specificity 100% (95% CI: 99–100)]; and cerebral small vessel disease and cerebral atrophy (sensitivity, PPV and specificity all > 97%). We obtained few reports of subarachnoid haemorrhage, microbleeds or subdural haematomas. In 110,695 reports from NHS Tayside, the commonest findings were atrophy (n = 28,757, 26%), small vessel disease (n = 15,015, 14%) and old, deep ischaemic strokes (n = 10,636, 10%).

Conclusions: An NLP algorithm can be developed in UK NHS radiology records to allow identification of cohorts of patients with important brain imaging phenotypes at a scale that would otherwise not be possible.
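
The accuracy measures quoted in the abstract (sensitivity, PPV, specificity, Cohen's κ) can all be derived from simple 2×2 count tables. The following is a minimal sketch of those calculations, not the authors' code. The function names are hypothetical, the counts in the usage lines are invented for illustration and are not taken from the study, and the Wilson score interval is an assumption, since the abstract does not state which confidence interval method was used.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def diagnostic_metrics(tp, fp, fn, tn):
    """Point estimate and interval for each measure, algorithm vs. expert label."""
    return {
        # Of the reports the experts marked positive, how many the algorithm found.
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        # Of the reports the algorithm flagged, how many the experts confirmed.
        "ppv": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        # Of the reports the experts marked negative, how many the algorithm left unflagged.
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


def cohens_kappa(both_yes, a_only, b_only, both_no):
    """Cohen's kappa for two readers rating the same reports as yes/no."""
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    # Agreement expected by chance if the two readers were independent.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (observed - expected) / (1 - expected)


# Invented counts for a 700-report test set (illustration only).
metrics = diagnostic_metrics(tp=80, fp=20, fn=10, tn=590)
for name, (estimate, (low, high)) in metrics.items():
    print(f"{name}: {estimate:.2f} (95% CI: {low:.2f}-{high:.2f})")
```

With a rare phenotype the negative class dominates, so specificity stays close to 100% even when PPV is modest, which matches the pattern reported for haemorrhagic stroke.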

【 License 】

CC BY   
