Zhongguo Jianchuan Yanjiu
Intelligent decision technology in combat deduction based on soft actor-critic algorithm
Min WANG [1], Xingzhong WANG [1], Wei LUO [1]
[1] China Ship Development and Design Center, Wuhan 430064, China
Keywords: combat deduction; independent decision making; deep reinforcement learning (DRL); soft policy iteration; maximum entropy
DOI: 10.19693/j.issn.1673-3185.02099
Source: DOAJ
【Abstract】
Objectives: Existing combat deduction simulation systems make decisions mainly from operational rules and empirical knowledge, which limits their application scenarios, lowers decision-making efficiency, and leaves them inflexible. To address these shortcomings of conventional decision-making methods, an intelligent decision-making model based on deep reinforcement learning (DRL) is proposed. Methods: First, the simulation deduction is formulated as a maximum entropy Markov decision process (MDP). The agent's training network is then built on an actor-critic architecture to generate stochastic policies, which improve the agent's exploration ability. Soft policy iteration is used to search for better policies and continuously raise the agent's decision-making level. Finally, simulations are carried out on the Mozi AI platform to validate the model. Results: An agent trained with the improved soft actor-critic (SAC) decision-making algorithm achieves autonomous decision-making. Compared with the deep deterministic policy gradient (DDPG) algorithm, its probability of winning increases by 24.53%. Conclusions: The design of this decision-making model can serve as a theoretical reference for research on intelligent decision-making technology and for warfare simulation and deduction.
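The maximum entropy formulation and the soft policy improvement step named in the Methods can be illustrated for a single state with a handful of discrete actions. The sketch below is only the textbook form of that step, not the authors' improved SAC: the abstract does not describe their networks, action space, or hyperparameters, so the Q-values, the temperature `alpha`, and all function names here are hypothetical. It shows that a policy proportional to exp(Q/alpha) stays stochastic, and that its entropy-regularised value is at least as large as the value of the greedy action.

```python
import math

def soft_policy(q_values, alpha):
    """Soft policy improvement for one state: pi(a) proportional to exp(Q(a) / alpha)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def soft_value(q_values, policy, alpha):
    """Entropy-regularised state value: E_pi[Q(a) - alpha * log pi(a)]."""
    return sum(p * (q - alpha * math.log(p))
               for p, q in zip(policy, q_values) if p > 0.0)

def entropy(policy):
    """Shannon entropy of the action distribution, in nats."""
    return -sum(p * math.log(p) for p in policy if p > 0.0)

# Hypothetical critic estimates for three candidate actions in one state.
q = [1.0, 0.5, -0.2]
alpha = 0.5  # assumed temperature; larger values favour exploration

pi = soft_policy(q, alpha)
v = soft_value(q, pi, alpha)

print("policy:", [round(p, 3) for p in pi])
print("entropy:", round(entropy(pi), 3))
print("soft value:", round(v, 3), "greedy value:", max(q))
```

As `alpha` approaches zero the policy concentrates on the highest-valued action and the soft value approaches the greedy value, which is the deterministic limit that a DDPG-style policy operates in.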
【License】
Unknown