Dissertation Details
Computationally Efficient Relational Reinforcement Learning
Bloch, Mitchell; Durfee, Edmund H.
University of Michigan
Keywords: Relational Reinforcement Learning; Rete; Adaptive Tile Coding; Online Learning; Sequential Decision Making; Computer Science; Engineering; Computer Science & Engineering
Others: https://deepblue.lib.umich.edu/bitstream/handle/2027.42/145859/bazald_1.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
Format: PDF
【 Abstract 】

Relational Reinforcement Learning (RRL) is a technique that enables Reinforcement Learning (RL) agents to generalize from their experience, allowing them to learn over large or potentially infinite state spaces, to learn context-sensitive behaviors, to solve variable goals, and to transfer knowledge between similar situations. Prior RRL architectures are not computationally efficient enough to see use outside of small, niche roles within larger Artificial Intelligence (AI) architectures. I present a novel online, incremental RRL architecture and an implementation that is orders of magnitude faster than its predecessors.

The first aspect of this architecture that I explore is a computationally efficient implementation of adaptive Hierarchical Tile Coding (aHTC), a kind of Adaptive Tile Coding (ATC) in which more general tiles covering larger portions of the state-action space are retained even as tiles covering smaller portions are introduced, using k-dimensional tries (k-d tries) to implement the value function for non-relational Temporal Difference (TD) methods. To achieve comparable performance for RRL, I replace the k-d tries with the Rete algorithm, which efficiently handles both the variable binding problem and variable numbers of actions. Tying aHTCs and Rete together, I present a rule grammar that both maps aHTCs onto Rete and allows the architecture to automatically extract relational features in order to support adaptation of the value function over time. I experiment with several refinement criteria, as well as additional functionality with which my agents attempt to determine whether re-refinement using different features might allow them to better learn a near-optimal policy. I present optimal results using a value criterion for several variants of BlocksWorld, and I provide transfer results for BlocksWorld and a scalable Taxicab domain.

I additionally introduce a Higher Order Grammar (HOG) that gives online, incremental RRL agents the flexibility to introduce additional variables and corresponding relations as needed in order to learn effective value functions. I evaluate agents that use the HOG on a version of BlocksWorld and on an Adventure task. In summary, I present a new online, incremental RRL architecture, a grammar to map aHTCs onto Rete, and an implementation that is orders of magnitude faster than its predecessors.
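For intuition about the hierarchical tile coding described above, the following is a minimal Python sketch, not the thesis implementation; names such as KdTrieNode, refine, and td_update are illustrative assumptions. Coarser tiles stay active even after finer ones are introduced, so a state's value is the sum of the weights along its root-to-leaf path in the k-d trie.

    # Minimal sketch of a hierarchical tile coding backed by a k-d trie.
    # Not the thesis implementation; all names here are illustrative.

    class KdTrieNode:
        """One tile: a node covers an axis-aligned region of state space.
        Coarse ancestors are kept even after finer children are introduced."""
        def __init__(self, lo, hi, depth=0):
            self.lo, self.hi = list(lo), list(hi)  # region bounds
            self.depth = depth
            self.weight = 0.0                      # this tile's contribution
            self.children = None                   # None until refined

        def refine(self):
            """Split this tile in half along one dimension (cycled by depth),
            keeping the parent tile and its weight in place."""
            d = self.depth % len(self.lo)
            mid = 0.5 * (self.lo[d] + self.hi[d])
            left_hi = list(self.hi); left_hi[d] = mid
            right_lo = list(self.lo); right_lo[d] = mid
            self.children = (KdTrieNode(self.lo, left_hi, self.depth + 1),
                             KdTrieNode(right_lo, self.hi, self.depth + 1))

        def tiles(self, s):
            """Yield every tile, coarse to fine, that covers state s."""
            node = self
            while True:
                yield node
                if node.children is None:
                    return
                d = node.depth % len(s)
                mid = 0.5 * (node.lo[d] + node.hi[d])
                node = node.children[0] if s[d] < mid else node.children[1]

    def value(root, s):
        return sum(t.weight for t in root.tiles(s))

    def td_update(root, s, target, alpha=0.1):
        """One TD(0)-style update: spread the error over all covering tiles."""
        covering = list(root.tiles(s))
        err = target - sum(t.weight for t in covering)
        for t in covering:
            t.weight += alpha * err / len(covering)

    root = KdTrieNode([0.0, 0.0], [1.0, 1.0])
    root.refine()                        # introduce two finer tiles
    td_update(root, [0.2, 0.7], target=1.0)
    print(value(root, [0.2, 0.7]))       # 0.1: error spread over 2 tiles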

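The variable binding problem that motivates the switch to Rete can also be illustrated with a toy sketch. This is an assumption-laden illustration, not the dissertation's Rete implementation: matching a conjunctive relational condition such as (on ?a ?b) together with (clear ?a) requires joining facts that agree on the shared variable ?a, and a real Rete network caches such partial joins in alpha and beta memories so matches update incrementally as facts change.

    # Toy illustration (not the thesis's Rete) of the variable binding
    # problem: matching (on ?a ?b) AND (clear ?a) means joining facts
    # that agree on the shared variable ?a.

    facts = {("on", "A", "B"), ("on", "B", "C"), ("clear", "A")}

    def match_on_clear(facts):
        on_facts = [f for f in facts if f[0] == "on"]        # alpha memory for "on"
        clear_facts = [f for f in facts if f[0] == "clear"]  # alpha memory for "clear"
        # beta join: keep only bindings consistent across both conditions
        return [{"?a": a, "?b": b}
                for (_, a, b) in on_facts
                for (_, c) in clear_facts
                if a == c]

    print(match_on_clear(facts))  # [{'?a': 'A', '?b': 'B'}]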
【 Preview 】
Attachments
Files | Size | Format | View
Computationally Efficient Relational Reinforcement Learning | 4836 KB | PDF | download
Document Metrics
Downloads: 11    Views: 30