Journal Article Details
Volume: 322
Certified reinforcement learning with logic guidance
Article
Keywords: MARKOV DECISION-PROCESSES; DISCRETE-TIME; STATE; SAFETY; APPROXIMATION; VERIFICATION; REACHABILITY; CHECKING; NETWORKS
DOI: 10.1016/j.artint.2023.103949
Source: SCIE
【Abstract】

Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied to a variety of control problems. However, applications in safety-critical domains require a systematic and formal approach to specifying requirements as tasks or goals. We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs). The given LTL property is translated into a Limit-Deterministic Generalised Büchi Automaton (LDGBA), which is then used to shape a synchronous reward function on-the-fly. Under certain assumptions, the algorithm is guaranteed to synthesise a control policy whose traces satisfy the LTL specification with maximal probability. © 2023 Elsevier B.V. All rights reserved.
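
The reward-shaping idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the five-cell corridor, the proposition names `goal` and `unsafe`, the hand-written three-state automaton for "eventually goal and never unsafe", and the 0/1 reward are all assumptions chosen for brevity. The paper itself targets continuous state/action MDPs and automata generated from arbitrary LTL formulas. The sketch only shows how an environment step and an automaton step can be taken synchronously, with the reward derived from the automaton state rather than from the environment.

```python
# Hypothetical automaton for the property "eventually goal and never unsafe".
# States: 0 = waiting for goal, 1 = accepting, 2 = rejecting trap.
ACCEPTING = {1}


def automaton_step(q, label):
    """Advance the automaton on the set of propositions true in the new state."""
    if q == 2 or "unsafe" in label:
        return 2  # a violation is permanent
    if q == 1 or "goal" in label:
        return 1  # goal seen and no violation so far
    return 0


def label_of(pos):
    """Labelling function of the toy corridor (assumed): cell 0 unsafe, cell 4 goal."""
    if pos == 0:
        return {"unsafe"}
    if pos == 4:
        return {"goal"}
    return set()


def product_step(pos, q, action):
    """Move in the corridor and the automaton together; the reward comes from the automaton."""
    next_pos = min(4, max(0, pos + action))
    next_q = automaton_step(q, label_of(next_pos))
    reward = 1.0 if next_q in ACCEPTING else 0.0
    return next_pos, next_q, reward


def rollout(actions, pos=2, q=0):
    """Apply a fixed action sequence and return the final product state and total reward."""
    total = 0.0
    for a in actions:
        pos, q, r = product_step(pos, q, a)
        total += r
    return pos, q, total
```

A model-free learner in this setting would observe only the pair `(pos, q)` and the shaped reward. For example, `rollout([+1, +1])` ends in the accepting automaton state, while `rollout([-1, -1])` ends in the trap and collects no reward.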

【License】

Free   
