| NEUROCOMPUTING | Volume: 450 |
| Delay-aware model-based reinforcement learning for continuous control | |
| Article | |
| Chen, Baiming1  Xu, Mengdi2  Li, Liang1  Zhao, Ding2  | |
| [1] Tsinghua Univ, Beijing 100084, Peoples R China | |
| [2] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA | |
| Keywords: Model-based reinforcement learning; Markov decision process; Continuous control; Delayed system; | |
| DOI : 10.1016/j.neucom.2021.04.015 | |
| Source: Elsevier | |
【 Abstract 】
Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of the delay-aware Markov Decision Process and proves that it can be transformed into a standard MDP with augmented states using the Markov reward process. We develop a delay-aware model-based reinforcement learning framework that can incorporate the multi-step delay into the learned system models without learning effort. Experiments on the Gym and MuJoCo platforms show that the proposed delay-aware model-based algorithm is more efficient in training and more transferable between systems with various durations of delay than state-of-the-art model-free reinforcement learning methods. (c) 2021 Elsevier B.V. All rights reserved.
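The core transform described in the abstract, turning a delayed-action process back into a standard MDP, works by augmenting the observation with the queue of actions that have been chosen but not yet applied. The sketch below is an illustrative reconstruction of that idea, not the paper's implementation; the wrapper class, the toy `Integrator1D` environment, and all names are assumptions for demonstration. With a delay of k steps, the augmented state is (observation, a_{t-k}, ..., a_{t-1}), which restores the Markov property.

```python
import numpy as np
from collections import deque


class Integrator1D:
    """Toy 1-D environment with dynamics x' = x + a (illustrative only)."""

    def reset(self):
        self.x = 0.0
        return np.array([self.x])

    def step(self, action):
        self.x += float(action[0])
        return np.array([self.x]), -abs(self.x), False


class DelayAugmentedEnv:
    """Wraps an environment whose actions take effect `delay` steps late
    and exposes an augmented state (observation + pending actions), so the
    wrapped process is again a standard MDP."""

    def __init__(self, env, delay, action_dim):
        self.env = env
        self.delay = delay
        self.action_dim = action_dim

    def reset(self):
        obs = self.env.reset()
        # Pre-fill the queue with no-op actions for the first `delay` steps.
        self.queue = deque(np.zeros(self.action_dim) for _ in range(self.delay))
        return self._augment(obs)

    def step(self, action):
        # The newly chosen action enters the queue; the action chosen
        # `delay` steps ago is the one actually applied now.
        self.queue.append(np.asarray(action, dtype=float))
        delayed_action = self.queue.popleft()
        obs, reward, done = self.env.step(delayed_action)
        return self._augment(obs), reward, done

    def _augment(self, obs):
        # Augmented state: current observation plus all pending actions.
        return np.concatenate([np.asarray(obs, dtype=float), *self.queue])
```

For example, with `delay=2` the first two chosen actions have no effect yet: only on the third step does the action chosen at step one reach the underlying dynamics.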
【 License 】
Free
【 Preview 】
| Files | Size | Format | View |
|---|---|---|---|
| 10_1016_j_neucom_2021_04_015.pdf | 1452KB | PDF | |