Journal Article Details
International Journal of Advanced Robotic Systems
A pretrained proximal policy optimization algorithm with reward shaping for aircraft guidance to a moving destination in three-dimensional continuous space
article
Zhuang Wang1  Hui Li1  Zhaoxin Wu2  Haolin Wu1 
[1] College of Computer Science, Sichuan University;National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University
Keywords: aircraft guidance; deep reinforcement learning; PPO; reward shaping
DOI: 10.1177/1729881421989546
Subject classification: Social Sciences, Humanities and Arts (General)
Source: InTech
[Abstract]

To guide an aircraft effectively to a moving destination along a specified approach direction in three-dimensional continuous space, an efficient intelligent algorithm is essential. This article proposes a pretrained proximal policy optimization (PPO) algorithm with reward shaping, which requires no accurate model, to solve the guidance problem for manned aircraft and unmanned aerial vehicles. A continuous action reward function and a position reward function are presented, which increase training speed and improve the quality of the generated trajectory. With pretrained PPO, a new agent can be trained efficiently for a new task. A reinforcement learning framework is built in which an agent can be trained to generate either a reference trajectory or a series of guidance instructions. General simulation results show that the proposed method significantly improves training efficiency and trajectory performance, and a carrier-based aircraft approach simulation demonstrates the practical value of the approach.
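The abstract names two shaping terms, a position reward and a continuous action reward, without giving their exact forms. The sketch below is a minimal, hypothetical interpretation: the position term rewards progress toward the (moving) destination between consecutive time steps, and the action term penalizes abrupt control changes so the generated trajectory stays smooth. All function names, signatures, and the weighting constant are assumptions for illustration, not the paper's actual definitions.

```python
import math

def position_reward(pos, dest, prev_pos, prev_dest):
    """Hypothetical position shaping term: reward the 3-D distance
    closed toward the moving destination since the last step."""
    d_prev = math.dist(prev_pos, prev_dest)
    d_now = math.dist(pos, dest)
    return d_prev - d_now  # positive when the aircraft gets closer

def action_reward(action, prev_action, weight=0.1):
    """Hypothetical continuous action shaping term: penalize large
    changes between consecutive control commands (smoothness)."""
    return -weight * math.dist(action, prev_action)

def shaped_reward(pos, dest, prev_pos, prev_dest, action, prev_action):
    """Combined shaped reward added to the sparse task reward at each step."""
    return (position_reward(pos, dest, prev_pos, prev_dest)
            + action_reward(action, prev_action))
```

In this reading, the shaped terms are dense signals added to the sparse terminal reward, which is what lets PPO learn faster than with the task reward alone.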

[License]

CC BY

[Preview]
Attachment list
Files Size Format View
RO202108130004925ZK.pdf 948KB PDF download