Solving planning problems with deep reinforcement learning and tree search

Deep reinforcement learning methods can learn complex heuristics with no prior knowledge, but struggle in environments where the learning signal is sparse. Planning methods, in contrast, can discover the optimal path to a goal in the absence of external rewards, but often require a hand-crafted heuristic function to be effective. In this thesis, we describe a model-based reinforcement learning method that bridges the gap between these two approaches. When evaluated on the complex domain of Sokoban, the model-based method proved more performant, more stable, and more sample-efficient than a model-free baseline.
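The central idea the abstract describes, replacing a hand-crafted planning heuristic with a learned value estimate, can be sketched as a best-first tree search over environment states. The Python sketch below is illustrative only and is not the thesis's actual algorithm; the names `best_first_search`, `successors`, and `value_net` are assumptions, with `value_net` standing in for a trained value network.

```python
# A minimal sketch (not the thesis's method): best-first tree search where
# node ordering comes from a learned value function instead of a
# hand-crafted heuristic. All names here are illustrative assumptions.

import heapq
from itertools import count
from typing import Callable, Hashable, Iterable, List, Optional

def best_first_search(
    start: Hashable,
    is_goal: Callable[[Hashable], bool],
    successors: Callable[[Hashable], Iterable[Hashable]],
    value_net: Callable[[Hashable], float],
    max_expansions: int = 10_000,
) -> Optional[List[Hashable]]:
    """Expand states in order of a learned value estimate.

    `value_net` plays the role classical planners assign to a hand-crafted
    heuristic: it is assumed to map a state to an estimate of its proximity
    to the goal (higher = closer).
    """
    tie = count()  # unique tie-breaker so heapq never compares raw states
    frontier = [(-value_net(start), next(tie), start, [start])]
    visited = {start}
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(
                    frontier, (-value_net(nxt), next(tie), nxt, path + [nxt])
                )
    return None  # search budget exhausted without reaching the goal

# Toy usage: reach 7 from 0 by +/-1 steps; the "heuristic" is negated
# distance to the goal, a stand-in for a trained value network.
path = best_first_search(
    start=0,
    is_goal=lambda s: s == 7,
    successors=lambda s: (s - 1, s + 1),
    value_net=lambda s: -abs(7 - s),
)
print(path)  # [0, 1, 2, 3, 4, 5, 6, 7]
```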