Journal Article Details
Frontiers in Neurorobotics
A parallel heterogeneous policy deep reinforcement learning algorithm for bipedal walking motion design
Neuroscience
Chunguang Li1  Chongben Tao2  Mengru Li2 
[1] School of Computer and Information Engineering, Changzhou Institute of Technology, Changzhou, Jiangsu, China;
[2] School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China;
Keywords: gait optimization; biped robot; Deep Deterministic Policy Gradient; experience replay; parallel heterogeneous strategy
DOI: 10.3389/fnbot.2023.1205775
Received: 2023-04-14; Accepted: 2023-07-25; Published: 2023
Source: Frontiers
【 Abstract 】

Considering the dynamics and non-linear characteristics of biped robots, gait optimization is an extremely challenging task. To tackle this issue, a parallel heterogeneous policy Deep Reinforcement Learning (DRL) algorithm for gait optimization is proposed. First, the Deep Deterministic Policy Gradient (DDPG) algorithm is used as the main architecture, running multiple biped robots in parallel to interact with the environment while sharing a single network to improve training efficiency. Furthermore, heterogeneous experience replay is employed instead of the traditional experience replay mechanism to make better use of collected experience. Second, based on the walking characteristics of biped robots, a periodic gait is designed with reference to sinusoidal curves; this gait accounts for the foot lift height, walking period, foot lift speed, and ground contact force of the biped robot. Finally, since different environments and different biped robot models pose distinct challenges for optimization algorithms, a unified gait optimization framework for biped robots is established on the RoboCup3D platform. Comparative experiments conducted within this framework show that the proposed method enables the biped robot to walk faster and more stably.
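The abstract's sinusoidal periodic gait can be illustrated with a minimal sketch. This is not the paper's actual trajectory generator; the function name, parameter names, and all numeric values below are illustrative assumptions chosen only to show how foot lift height and walking period might parameterize a sine-based swing-foot profile.

```python
import math

def foot_trajectory(t, period=0.6, lift_height=0.04, step_length=0.10):
    """Hypothetical sinusoidal swing-foot trajectory for one gait cycle.

    t           -- time within the cycle, seconds
    period      -- walking period T, seconds (assumed value)
    lift_height -- maximum foot lift height, meters (assumed value)
    step_length -- horizontal travel per step, meters (assumed value)
    Returns (x, z): forward displacement and foot height.
    """
    phase = (t % period) / period                 # normalized phase in [0, 1)
    x = step_length * phase                       # forward progress of the swing foot
    z = lift_height * math.sin(math.pi * phase)   # lift then lower: z = h * sin(pi * phase)
    return x, z

# At mid-swing (t = T/2, phase 0.5) the foot reaches its peak lift height.
x, z = foot_trajectory(0.3, period=0.6)
```

The half-period sine for the vertical component guarantees the foot leaves and re-contacts the ground with zero height at the cycle boundaries, which is one common reason sinusoidal curves are used as a gait template.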

【 License 】

Unknown   
Copyright © 2023 Li, Li and Tao.

【 Preview 】
Attachment List
Files                        Size    Format  View
RO202310101907371ZK.pdf      3288KB  PDF     download
fcomp-05-1085867-i0001.tif   25KB    Image   download
Algorithm 2                  362KB   Table   download
【 Figures and Tables 】

fcomp-05-1085867-i0001.tif

Document Metrics
Downloads: 21; Views: 0