Dynamic treatment regimes (DTRs) are sequential decision rules that focus simultaneously on treatment individualization and adaptation over time. In Project 1, we consider identifying the optimal personalized timing for treatment initiation. Instead of considering multiple fixed decision stages as in most DTR literature, we deal with random, possibly continuous, decision points for treatment initiation given each patient's disease and treatment history. For a set of predefined candidate DTRs, we fit a flexible survival model with splines of time-varying covariates to estimate patient-specific probabilities of adherence to each DTR. Then we employ an inverse probability weighted estimator for the counterfactual mean utility to assess each DTR and identify the optimal one. In Project 2, we propose a dynamic statistical learning method, adaptive contrast weighted learning (ACWL), to explore optimal DTRs without prespecifying candidates. ACWL can handle multiple treatments at a fixed number of stages. At each stage, we develop semiparametric regression-based contrasts with the adaptation of treatment effect ordering for each patient. The adaptive contrasts simplify the problem of optimization with multiple treatment comparisons to a weighted classification problem that can be solved by existing machine learning techniques. The algorithm is implemented recursively using backward induction. Through simulation studies, we show that the proposed method is robust and efficient for the identification of optimal DTRs. In Project 3, we propose a tree-based reinforcement learning (T-RL) method to directly estimate optimal DTRs in a multi-stage multi-treatment setting. At each stage, T-RL builds an unsupervised decision tree that maintains the nature of batch-mode reinforcement learning. Unlike ACWL, T-RL directly handles the problem of optimization with multiple treatment comparisons, through the purity measure constructed with semiparametric regression estimators. For multiple stages, the algorithm is implemented recursively using backward induction. By combining robust semiparametric regression with flexible tree-based learning, we show that T-RL is robust, efficient and easy to interpret for the identification of optimal DTRs.
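The Project 1 evaluation step (weight each patient who adhered to a candidate DTR by the inverse of the estimated adherence probability, then compare regimes by weighted mean utility) can be illustrated with a minimal sketch. This is not the thesis's implementation: the normalized (Hajek-type) form of the estimator is an assumption, the adherence probabilities are taken as given numbers rather than fitted with the spline-based survival model, and the names `ipw_value` and `best_regime` are hypothetical.

```python
def ipw_value(utilities, adhered, adherence_prob):
    """Normalized IPW estimate of the mean utility under one candidate regime.

    utilities      : observed utility for each patient
    adhered        : 1/True if the patient's observed treatment history is
                     consistent with the regime, else 0/False
    adherence_prob : estimated probability of adhering to the regime
                     (assumed to be supplied by some fitted model)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for y, c, p in zip(utilities, adhered, adherence_prob):
        if not c:
            continue  # non-adherent patients contribute zero weight
        if not 0.0 < p <= 1.0:
            raise ValueError("adherence probabilities must lie in (0, 1]")
        w = 1.0 / p
        weighted_sum += w * y
        weight_total += w
    if weight_total == 0.0:
        raise ValueError("no patient adhered to this regime")
    return weighted_sum / weight_total


def best_regime(utilities, adhered_by_regime, prob_by_regime):
    """Return the candidate regime with the largest estimated mean utility."""
    values = {
        name: ipw_value(utilities, adhered_by_regime[name], prob_by_regime[name])
        for name in adhered_by_regime
    }
    return max(values, key=values.get), values


# Toy data, invented purely for illustration.
utilities = [10.0, 4.0, 6.0, 8.0]
adhered_by_regime = {"A": [1, 0, 1, 0], "B": [0, 1, 0, 1]}
prob_by_regime = {"A": [0.5, 0.5, 0.25, 0.5], "B": [0.5, 0.5, 0.5, 0.5]}

chosen, values = best_regime(utilities, adhered_by_regime, prob_by_regime)
```

On this toy data, regime A has weights 2 and 4 on utilities 10 and 6, giving 44/6, while regime B gives 24/4 = 6, so A is selected. Dividing by the sum of weights rather than the sample size keeps the estimate within the range of observed utilities, which tends to stabilize it when some adherence probabilities are small.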
Semiparametric Regression and Machine Learning Methods for Estimating Optimal Dynamic Treatment Regimes.