Many agent-environment interactions can be framed as dynamical systemsin which agents take actions and receive observations.Thesedynamical systems are diverse, representing such things as a bipedwalking, a stock price changing over time, the trajectory of amissile, or the shifting fish population in a lake.Often, interacting successfully with the environment requires the useof a model, which allows the agent to predict something about thefuture by summarizing the past.Two of the basic problems in modelingpartially observable dynamical systems are selecting a representationof state and selecting a mechanism for maintaining that state.Thisthesis explores both problems from a learning perspective: we areinterested in learning a predictive model directly from the data thatarises as an agent interacts with its environment.This thesis develops models for dynamical systems which representstate as a set of statistics about the short-term future, as opposedto treating state as a latent, unobservable quantity.In other words,the agent summarizes the past into predictions about the short-termfuture, which allow the agent to make further predictions about theinfinite future.Because all parameters in the model are definedusing only observable quantities, the learning algorithms for suchmodels are often straightforward and have attractive theoreticalproperties.We examine in depth the case where state is representedas the parameters of an exponential family distribution over ashort-term window of future observations.We unify a number ofdifferent existing models under this umbrella, and predict and analyzenew models derived from the generalization.One goal of this research is to push models with predictively definedstate towards real-world applications.We contribute models andcompanion learning algorithms for domains with partial observability,continuous observations, structured observations, high-dimensionalobservations, and/or continuous actions.Our models successfullycapture standard POMDPs and benchmark nonlinear timeseries problemswith performance comparable to state-of-the-art models.They alsoallow us to perform well on novel domains which are larger than thosecaptured by other models with predictively defined state, includingtraffic prediction problems and domains analogous to autonomous mobilerobots with camera sensors.
【 预 览 】
附件列表
Files
Size
Format
View
Exponential Family Predictive Representations of State.