Introduction: Current state-of-the-art protocols for maintenance dosing of the oral anticoagulant warfarin in clinical practice have several shortcomings. Two key challenges are the lack of personalized dose adjustment and the high cost of monitoring the efficacy of therapy through International Normalized Ratio (INR) measurements. A new dosing algorithm based on Reinforcement Learning (RL), specifically Q-Learning with functional policy approximation, was created to personalize maintenance dosing of warfarin based on observed INR and to optimize the length of time between INR measurements. The method is intended to improve patients' INR time in therapeutic range (TTR) and to reduce the cost of INR monitoring compared with the current standard of care.
Procedure: Using the principles of RL, an algorithm to control warfarin dosing was created. The algorithm uses 9 controllers, corresponding to 9 levels of warfarin sensitivity. It switches between controllers until it selects the one that most closely matches the individual patient's response, so that the dose change (ΔDose) and the time between INR measurements (ΔTime) are personalized for each patient based on the INR observed in that patient. Three simulations were performed, each using 100 artificial patients generated from real patient data. The first simulation was an ideal-case scenario (a clean simulation in which the coefficient of variation (CV) of the noise added to the model output was 0) using only the warfarin RL algorithm, to demonstrate efficacy. The second simulation used the current standard of care with CV = 25% to simulate intra-patient variability. The third simulation used the warfarin RL algorithm with CV = 25%.
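The noise setting and the TTR benchmark used in these simulations can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the noise is assumed to be multiplicative Gaussian with standard deviation CV times the model output, the therapeutic range of 2.0 to 3.0 is an assumed default that the text does not state, and TTR is computed here by simple day counting rather than by any interpolation method.

```python
import random


def add_measurement_noise(inr, cv, rng):
    """Perturb a model INR value with zero-mean Gaussian noise.

    Assumption: the noise is multiplicative, with standard deviation cv * inr.
    With cv = 0 the model output is returned unchanged (the "clean" case).
    """
    return inr * (1.0 + cv * rng.gauss(0.0, 1.0))


def time_in_therapeutic_range(daily_inr, low=2.0, high=3.0):
    """Percentage of simulated days whose INR lies within [low, high].

    Assumption: the 2.0-3.0 range is an illustrative default only.
    """
    if not daily_inr:
        raise ValueError("daily_inr must not be empty")
    in_range = sum(1 for inr in daily_inr if low <= inr <= high)
    return 100.0 * in_range / len(daily_inr)


# Hypothetical flat model output over a 180-day simulation.
rng = random.Random(0)
clean = [2.5] * 180
noisy = [add_measurement_noise(x, 0.25, rng) for x in clean]

print("TTR, CV = 0:   ", time_in_therapeutic_range(clean))
print("TTR, CV = 25%: ", time_in_therapeutic_range(noisy))
```

Passing a seeded `random.Random` instance keeps the run reproducible. A real simulation would replace the flat series with the output of a patient response model under the chosen dosing policy.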
Each patient was simulated for 180 days in each simulation. Efficacy of the therapy was benchmarked by INR time in therapeutic range (TTR) and by the number of INR measurements taken during the simulation.
Results: The first simulation yielded a mean TTR of 92.1% (standard deviation 4.2%) and a mean of 7.94 INR measurements per patient. The second simulation yielded a mean TTR of 45.3% (standard deviation 16.4%) and a mean of 12.3 INR measurements per patient. The third simulation yielded a mean TTR of 51.8% with a standard deviation of 10.8%, and had a mean number of
Personalized anticoagulant management using reinforcement learning.