Journal article details
BMC Medical Informatics and Decision Making
A comprehensive tool for creating and evaluating privacy-preserving biomedical prediction models
Fabian Prasser [1], Johanna Eicher [2], Helmut Spengler [2], Klaus A. Kuhn [2], Raffael Bild [2]
[1] Berlin Institute of Health (BIH); School of Medicine, Technical University of Munich
Keywords: Biomedical data; Prediction models; Machine learning; Classification; Privacy protection; Data anonymization
DOI: 10.1186/s12911-020-1041-3
Source: DOAJ
【 Abstract 】

Background: Modern data-driven medical research promises to provide new insights into the development and course of disease and to enable novel methods of clinical decision support. To realize this, machine learning models can be trained to make predictions from clinical, paraclinical and biomolecular data. In this process, privacy protection and regulatory requirements need careful consideration, as the resulting models may leak sensitive personal information. To counter this threat, a wide range of methods for integrating machine learning with formal methods of privacy protection have been proposed. However, there is a significant lack of practical tools to create and evaluate such privacy-preserving models. In this software article, we report on our ongoing efforts to bridge this gap.

Results: We have extended the well-known ARX anonymization tool for biomedical data with machine learning techniques to support the creation of privacy-preserving prediction models. Our methods are particularly well suited for applications in biomedicine, as they preserve the truthfulness of data (e.g. no noise is added) and they are intuitive and relatively easy to explain to non-experts. Moreover, our implementation is highly versatile, as it supports binomial and multinomial target variables, different types of prediction models and a wide range of privacy protection techniques. All methods have been integrated into a sound framework that supports the creation, evaluation and refinement of models through intuitive graphical user interfaces. To demonstrate the broad applicability of our solution, we present three case studies in which we created and evaluated different types of privacy-preserving prediction models for breast cancer diagnosis, diagnosis of acute inflammation of the urinary system and prediction of the contraceptive method used by women. In this process, we also used a wide range of different privacy models (k-anonymity, differential privacy and a game-theoretic approach) as well as different data transformation techniques.

Conclusions: With the tool presented in this article, accurate prediction models can be created that preserve the privacy of individuals represented in the training set in a variety of threat scenarios. Our implementation is available as open source software.

【 License 】

Unknown   
