Journal Article Details
Frontiers in Neurorobotics
Scaled Free-Energy Based Reinforcement Learning for Robust and Efficient Learning in High-Dimensional State Spaces
Stefan Elfwing1  Eiji Uchibe1  Kenji Doya1
[1] Okinawa Institute of Science and Technology Graduate University;
Keywords: reinforcement learning; free energy; function approximation; restricted Boltzmann machine; robot navigation
DOI: 10.3389/fnbot.2013.00003
Source: DOAJ
【 Abstract 】

Free-energy based reinforcement learning was proposed for learning in high-dimensional state and action spaces, which cannot be handled by standard function approximation methods. In this study, we propose a scaled version of free-energy based reinforcement learning to achieve more robust and more efficient learning performance. The action-value function is approximated by the negative free energy of a restricted Boltzmann machine, divided by a constant scaling factor that is related to the size of the Boltzmann machine (the square root of the number of state nodes in this study). Our first task is a digit floor gridworld task, where the states are represented by images of handwritten digits from the MNIST data set. The purpose of the task is to investigate the proposed method's ability, through the extraction of task-relevant features in the hidden layer, to cluster images of the same digit and to cluster images of different digits that correspond to states with the same optimal action. We also test the method's robustness with respect to different exploration schedules, i.e., different settings of the initial temperature and the temperature discount rate in softmax action selection. Our second task is a robot visual navigation task, where the robot can learn its position by the different colors of the lower part of four landmarks and it can infer the correct corner goal area by the color of the upper part of the landmarks. The state space consists of binarized camera images with, at most, nine different colors, which is equal to 6642 binary states. For both tasks, the learning performance is compared with standard free-energy based reinforcement learning and with function approximation where the action-value function is approximated by a two-layered feedforward neural network.
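
The scaled action-value approximation described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: bias terms are omitted, each action is encoded as a one-hot vector, and the function and parameter names (`scaled_q_values`, `softmax_probs`, `w`, `u`) are hypothetical. Under those assumptions, the negative free energy of a restricted Boltzmann machine with binary hidden nodes reduces to a sum of softplus terms over the hidden-node inputs, and the sketch divides it by the square root of the number of state nodes. It is not the authors' implementation, and the temperature schedule is not modelled.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def scaled_q_values(state, w, u, scale=None):
    """Return Q(s, a) = -F(s, a) / Z for every one-hot action a.

    state: list of 0/1 values, one per state node
    w:     n_state x n_hidden weights (state node -> hidden node)
    u:     n_actions x n_hidden weights (action node -> hidden node)
    scale: Z; defaults to sqrt(number of state nodes), as in the abstract
    Assumption: no bias terms, so -F is a sum of softplus terms.
    """
    n_state = len(state)
    n_hidden = len(w[0])
    z = math.sqrt(n_state) if scale is None else scale
    # Input to each hidden node from the state nodes (shared by all actions).
    state_input = [
        sum(state[i] * w[i][k] for i in range(n_state))
        for k in range(n_hidden)
    ]
    q = []
    for u_a in u:
        # With a one-hot action, only that action's weight row contributes.
        neg_free_energy = sum(
            softplus(state_input[k] + u_a[k]) for k in range(n_hidden)
        )
        q.append(neg_free_energy / z)
    return q


def softmax_probs(q, temperature):
    # Softmax (Boltzmann) action selection probabilities at a given temperature.
    m = max(q)  # subtract the maximum for numerical stability
    exps = [math.exp((v - m) / temperature) for v in q]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Toy example: 4 state nodes, 3 hidden nodes, 2 actions (made-up weights).
    state = [1, 0, 1, 0]
    w = [[0.2, -0.1, 0.0], [0.0, 0.3, 0.1], [-0.2, 0.1, 0.4], [0.1, 0.0, 0.0]]
    u = [[0.5, 0.0, -0.3], [-0.5, 0.2, 0.3]]
    q = scaled_q_values(state, w, u)
    print(q, softmax_probs(q, temperature=0.5))
```

Passing `scale=1.0` gives the unscaled variant, which is one way to compare the two approximations on the same weights.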

【 License 】

Unknown   
