Thesis details
Inverse optimal control for deterministic continuous-time nonlinear systems
Johnson, Miles
Keywords: optimal control; inverse reinforcement learning; inverse optimal control; apprenticeship learning; learning from demonstration; iterative learning control
Others: https://www.ideals.illinois.edu/bitstream/handle/2142/46747/Miles_Johnson.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】

Inverse optimal control is the problem of computing a cost function with respect to which observed state input trajectories are optimal. We present a new method of inverse optimal control based on minimizing the extent to which observed trajectories violate first-order necessary conditions for optimality. We consider continuous-time deterministic optimal control systems with a cost function that is a linear combination of known basis functions. We compare our approach with three prior methods of inverse optimal control. We demonstrate the performance of these methods by performing simulation experiments using a collection of nominal system models. We compare the robustness of these methods by analyzing how they perform under perturbations to the system. We consider two scenarios: one in which we exactly know the set of basis functions in the cost function, and another in which the true cost function contains an unknown perturbation. Results from simulation experiments show that our new method is computationally efficient relative to prior methods, performs similarly to prior approaches under large perturbations to the system, and better learns the true cost function under small perturbations. We then apply our method to three problems of interest in robotics. First, we apply inverse optimal control to learn the physical properties of an elastic rod. Second, we apply inverse optimal control to learn models of human walking paths. These models of human locomotion enable automation of mobile robots moving in a shared space with humans, and enable motion prediction of walking humans given partial trajectory observations. Finally, we apply inverse optimal control to develop a new method of learning from demonstration for quadrotor dynamic maneuvering. We compare and contrast our method with an existing state-of-the-art solution based on minimum-time optimal control, and show that our method can generalize to novel tasks and reject environmental disturbances.
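To make the idea of "minimizing the extent to which observed trajectories violate first-order necessary conditions" concrete, here is a minimal sketch on a toy problem. It is not the thesis's algorithm. It assumes a scalar linear system xdot = a*x + b*u with running cost c1*x^2 + c2*u^2 (two known basis functions, unknown weights), and it eliminates the costate by hand using Pontryagin's conditions, which leaves the residual c1*x - c2*(udot + a*u)/b. Because the weights are only identifiable up to scale, c2 is fixed to 1 and c1 is found by least squares. The function names and the numerical values are hypothetical, chosen only for illustration.

```python
import math


def simulate_lqr(a, b, q, r, x0, dt, n):
    """Sample the optimal trajectory of xdot = a*x + b*u for the
    infinite-horizon cost integral of q*x^2 + r*u^2 (toy 'expert' data)."""
    # Stabilizing root of the scalar Riccati equation 2*a*P + q - (b*P)**2 / r = 0.
    P = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * P / r          # optimal feedback gain, u = -k*x
    lam = a - b * k        # closed-loop pole (negative, so x decays)
    xs = [x0 * math.exp(lam * i * dt) for i in range(n)]
    us = [-k * x for x in xs]
    return xs, us


def estimate_state_weight(xs, us, a, b, dt):
    """Least-squares estimate of c1 with c2 fixed to 1.

    Minimizes the sum over samples of (c1*x - (udot + a*u)/b)**2, which is
    the squared violation of the necessary conditions for this toy problem.
    """
    num = 0.0
    den = 0.0
    for i in range(1, len(xs) - 1):
        # Central finite difference for the input derivative (assumption:
        # the observed input is sampled uniformly and is smooth).
        udot = (us[i + 1] - us[i - 1]) / (2.0 * dt)
        g = (udot + a * us[i]) / b
        num += xs[i] * g
        den += xs[i] * xs[i]
    return num / den


if __name__ == "__main__":
    # Hypothetical ground truth: q / r = 3.
    xs, us = simulate_lqr(a=-0.5, b=1.0, q=3.0, r=1.0, x0=1.0, dt=1e-3, n=3000)
    print(estimate_state_weight(xs, us, a=-0.5, b=1.0, dt=1e-3))
```

In this toy case the residual is linear in the unknown weights, so the fit reduces to a closed-form least-squares ratio and needs no repeated solution of the forward optimal control problem. The estimate recovers only the ratio c1/c2, so scaling both true weights by the same factor produces the same observed trajectory and the same estimate.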

【 Preview 】
Attachments
Files Size Format View
Inverse optimal control for deterministic continuous-time nonlinear systems 9801KB PDF download