IEEE Open Journal of Intelligent Transportation Systems
Designing Lookahead Policies for Sequential Decision Problems in Transportation and Logistics
Warren B. Powell [1]
[1] Operations Research and Financial Engineering Department, Princeton University, Princeton, NJ, USA
Keywords: Direct lookahead approximations; model predictive control; parametric cost function approximation; policy search; reinforcement learning; sequential decisions
DOI: 10.1109/OJITS.2022.3148574
Source: DOAJ
【 Abstract 】
There is a wide range of sequential decision problems in transportation and logistics that require dealing with uncertainty. There are four classes of policies that we can draw on for different types of decisions, but many problems in transportation and logistics will ultimately require some form of direct lookahead policy (DLA) where we optimize decisions over some horizon to make a decision now. The most common strategy is to use a deterministic lookahead (think Google Maps), but what if you want to handle uncertainty? In this paper, we identify two major strategies for designing practical, implementable lookahead policies which handle uncertainty in fundamentally different ways. The first is a suitably parameterized deterministic lookahead, where the parameterization is tuned in a stochastic simulator. The second uses an approximate stochastic lookahead, where we identify six classes of approximations, one of which involves designing a “policy-within-a-policy,” for which we turn to all four classes of policies. We claim that our approximate lookahead model spans all the classical stochastic optimization tools for lookahead policies, while opening up pathways for new policies. But we also insist that the idea of a parameterized deterministic lookahead is a powerful new idea that offers features that, for some problems, can outperform the more familiar stochastic lookahead policies.
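As an illustration of the first strategy named in the abstract, the sketch below tunes a parameterized deterministic lookahead in a stochastic simulator. It is not taken from the paper: the toy inventory setting, the cost figures, and the names `lookahead_order`, `simulate`, `tune`, and `theta` are all assumptions made for this example. The policy plans against a point forecast of demand, scaled by a tunable factor `theta`, and `theta` is chosen by a plain grid search over simulated demand paths.

```python
import random


def lookahead_order(inventory, forecasts, theta):
    """Hypothetical parameterized deterministic lookahead.

    Plans against the point forecasts over the horizon, inflated by theta.
    theta = 1.0 gives the plain deterministic lookahead.
    """
    target = theta * sum(forecasts)
    return max(0.0, target - inventory)


def simulate(theta, n_paths=200, periods=50, horizon=1,
             mean_demand=10.0, sd=3.0, hold=1.0, short=5.0, seed=0):
    """Average cost per sample path of the policy under random demand.

    All numbers here are assumed for illustration. A fixed seed reuses
    the same demand paths for every theta (common random numbers).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        inventory = 0.0
        for _ in range(periods):
            forecasts = [mean_demand] * horizon
            inventory += lookahead_order(inventory, forecasts, theta)
            demand = max(0.0, rng.gauss(mean_demand, sd))
            unmet = max(0.0, demand - inventory)
            inventory = max(0.0, inventory - demand)
            total += hold * inventory + short * unmet
    return total / n_paths


def tune(thetas):
    """Policy search by grid: return the theta with the lowest simulated cost."""
    return min(thetas, key=simulate)


if __name__ == "__main__":
    grid = [0.8, 1.0, 1.3, 1.8]
    for theta in grid:
        print(f"theta={theta:.1f}  avg cost={simulate(theta):.1f}")
    print("best theta:", tune(grid))
```

Because shortages are assumed to cost more than holding stock, the simulator favors a `theta` above 1.0, so the tuned policy keeps a buffer that the unparameterized lookahead (`theta = 1.0`) would not. The lookahead model itself stays deterministic; uncertainty enters only through the simulator used for tuning.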
【 License 】
Unknown