Thesis details
Noncooperative static and dynamic games: addressing shared constraints and phase transitions
Yin, Huibing
Keywords: mean-field game; oscillators; synchronization of oscillators; static and dynamic game; phase transition; flow control; mean-field approximation; optimal control; Nonlinear system; Hamilton-Jacobi-Bellman (HJB) equation; Fokker-Planck-Kolmogorov (FPK) equation; Approximate Dynamic Programming (ADP); Nash equilibrium; active queue management
Others  :  https://www.ideals.illinois.edu/bitstream/handle/2142/30922/Yin_Huibing.pdf?sequence=1&isAllowed=y
United States | English
Source: The Illinois Digital Environment for Access to Learning and Scholarship
【 Abstract 】
Compared to linear systems, nonlinear generalizations may exhibit both non-equilibrium and equilibrium behavior in the long run. The characterization of such behavior is challenging, particularly when overlaid by an optimization or control layer, and is of relevance in a range of applications, e.g., neuroscience, biology, economics, communication networks and power systems. The objective of this thesis is to consider these questions for two prototypical applications of nonlinear multi-agent systems: (1) large populations of coupled oscillators and (2) communication networks. The research is divided into the following three parts:

Synchronization of oscillators: The purpose of this part is to understand phase transition in noncooperative dynamic games with a large number of agents. The focus of analysis is on a variation of the large population linear quadratic Gaussian (LQG) model proposed by Huang et al. 2007 [1], comprised here of a controlled N-dimensional stochastic differential equation model, coupled only through a cost function. The states are interpreted as phase angles for a collection of non-homogeneous oscillators, and in this way the model may be regarded as an extension of the classical coupled oscillator model of Kuramoto.

A deterministic PDE model is proposed, which is shown to approximate the stochastic system as the population size approaches infinity. Key to the analysis of the PDE model is the existence of a particular Nash equilibrium in which the agents ‘opt out’ of the game, setting their controls to zero, resulting in the ‘incoherence’ equilibrium.

Next we introduce approximate dynamic programming (ADP) techniques for the design and adaptation (learning) of approximately optimal control laws for this model. For this purpose, a parameterization is proposed, based on analysis of the mean-field PDE model for the game. In an offline setting, a Galerkin procedure is introduced to choose the optimal parameters.
In an online setting, a steepest descent algorithm is proposed. We provide detailed analysis of the optimal parameter values as well as the Bellman error with both the Galerkin approximation and the online algorithm.

Methods from dynamical systems theory are used in a bifurcation analysis, based on a linearization of the PDE model about the incoherence equilibrium. A critical value of the control cost parameter is identified: Above this value, the oscillators are incoherent; and below this value (when control is sufficiently cheap) the oscillators synchronize. Then we simplify the analysis by relating the solutions of the PDE model to the solutions of a certain nonlinear eigenvalue problem. Both analysis and computation are significantly easier for the nonlinear eigenvalue problem. Apart from the bifurcation analysis that shows existence of a phase transition, we also describe a Lyapunov-Schmidt perturbation method to obtain asymptotic formulae for the small amplitude bifurcated solutions.

A key question in the design of engineered competitive systems has been that of the efficiency of the associated equilibria. Yet, there is little known in this regard in the context of stochastic dynamic games in a large population regime. Here, we examine the efficiency of the associated mean-field equilibria with respect to a related welfare optimization problem. We construct variational problems both for the noncooperative game and its centralized counterpart and employ these problems as a vehicle for conducting this analysis. Using a bifurcation analysis, we analyze the variational solutions and the associated efficiency loss. An expression for the local bound of efficiency loss is obtained for the homogeneous population.

All the conclusions are illustrated with results from numerical experiments.

Nash games with coupled strategy sets: Generalized Nash equilibria (GNE) represent extensions of the Nash solution concept when the strategy sets are coupled across agents.
We consider a restricted class of such games, referred to as generalized Nash games, in which the agents contend with shared or common constraints and their payoff functions are further linked via a scaled congestion cost metric. When strategy sets are continuous and the metric is an increasing convex function, a solution to a related variational inequality provides a set of equilibria characterized by common Lagrange multipliers for shared constraints. In general, this variational inequality problem is non-monotone. However, we show that under mild conditions, it admits solutions, even in the absence of restrictive compactness assumptions on strategy sets. Additionally, we show that the equilibrium is locally unique both in the primal space as well as in the larger primal-dual space. The existence statements can be generalized to accommodate a piecewise-smooth metric while affine restrictions, surprisingly, lead to both existence and uniqueness guarantees. The second half of the part provides a brief discussion of distributed computation of such equilibria in monotone regimes via a distributed iterative Tikhonov regularization (ITR) scheme. Notably, such schemes are single-timescale counterparts of standard Tikhonov regularization methods and involve updating the regularization parameter after every gradient step. Application of such techniques to a class of network flow rate allocation games suggests that the ITR schemes perform better than their two-timescale counterparts.

Nonlinear network flow control with AQM feedback: The last part of this thesis investigates stability, bifurcation and oscillations arising in a communication network model with a large number of heterogeneous users adopting a Transmission Control Protocol (TCP)-like rate control scheme with an Active Queue Management (AQM) router. The heterogeneity in the system is due to different user delays that are known and fixed but taken from a given distribution.
It is shown that for any given distribution of delays, there exists a critical amount of feedback (due to AQM) at which the equilibrium loses stability and a limit cycling solution develops via a Hopf bifurcation. The nature (criticality) of the bifurcation is investigated with the aid of the Lyapunov-Schmidt perturbation method. The results of the analysis are numerically verified and provide valuable insights into the dynamics of the AQM control system.
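
As a rough illustration of how delayed feedback can destabilize an equilibrium once the feedback gain passes a critical value, the sketch below integrates a toy scalar equation x'(t) = -gain * x(t - delay). This is not the TCP/AQM model of the thesis: the equation, the function names, the constant initial history and the step size are all assumptions chosen for illustration. For this toy equation the zero equilibrium is known to be stable exactly when gain * delay < pi/2, and a pair of complex eigenvalues crosses the imaginary axis at that threshold. Because the sketch is linear, it only shows decaying versus growing oscillations; the limit cycle itself, and whether the bifurcation is super- or subcritical, depend on nonlinear terms that are omitted here.

```python
import math


def simulate_delayed_feedback(gain, delay=1.0, dt=0.01, t_end=50.0):
    """Forward-Euler integration of the toy equation x'(t) = -gain * x(t - delay).

    Returns the samples x(0), x(dt), ..., x(t_end).
    The history on [-delay, 0] is held at 1.0 (an arbitrary, assumed choice).
    """
    lag = round(delay / dt)        # delay expressed in time steps
    steps = round(t_end / dt)
    x = [1.0] * (lag + 1)          # samples covering t in [-delay, 0]
    for _ in range(steps):
        # x[-1] is the current state, x[-1 - lag] is the delayed state
        x.append(x[-1] - dt * gain * x[-1 - lag])
    return x[lag:]                 # drop the pre-history, keep t >= 0


def tail_amplitude(samples, fraction=0.2):
    """Largest |x| over the final `fraction` of the trajectory."""
    tail = samples[int(len(samples) * (1 - fraction)):]
    return max(abs(v) for v in tail)


# For delay = 1 the stability boundary of the toy equation is gain = pi / 2.
critical_gain = math.pi / 2
below = tail_amplitude(simulate_delayed_feedback(gain=1.0))
above = tail_amplitude(simulate_delayed_feedback(gain=2.0))

print(f"critical gain for delay 1.0: {critical_gain:.3f}")
print(f"gain 1.0 (below critical): tail amplitude {below:.2e}")
print(f"gain 2.0 (above critical): tail amplitude {above:.2e}")
```

Running the script with gains on either side of pi/2 should show the oscillation dying out in the first case and growing in the second. A single fixed delay is used only to keep the sketch short, whereas the abstract describes delays drawn from a distribution across users.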
Attachments
Files | Size | Format
Noncooperative static and dynamic games: addressing shared constraints and phase transitions | 2752KB | PDF