We are developing an adaptive learning framework that addresses the covariate shift encountered in Behavioral Cloning (BC). BC user-modeling is a technique in which data gathered by observing a user's navigation is used to train a neural network classifier. The classifier learns to map states of the environment to the user's actions so it can reproduce their behavior in the future. The main challenge of this problem is insufficient data, which leads to underperforming models. The motivation of this research is to provide an adaptive framework that encourages users to demonstrate strategies they have not yet exhibited, when they otherwise would not. This work takes advantage of the deficiencies of BC user-modeling: by training a Reinforcement Learning (RL) agent to compete against each user-model, the agent learns to exploit the model's weaknesses. This RL agent then provides a means to drive the user in the next learning step. While this technique is designed for BC user-modeling, this targeted form of data collection offers a general solution for amassing comprehensive user-datasets. To facilitate these goals, we develop a testbed called Turn-Based Adversarial Game (TAG), which addresses key problems in alternative testbeds for user-modeling. Applying our adaptive framework to TAG, we show how we can drive human subjects to demonstrate new strategies organically. We have tested our approach with one real human user and plan to test it with two more. The findings show a significant change in the user's strategy at each step.
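The loop described above (clone the user, then probe the clone for weaknesses) can be sketched minimally. The code below is a hedged illustration, not the authors' implementation: it uses a logistic classifier in place of their neural network, synthetic state-action pairs in place of real user demonstrations, and a confidence-based probe as a stand-in for the trained RL adversary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical user demonstrations: 2-D states, binary actions.
# This synthetic "user" picks action 1 whenever the first state feature is positive.
states = rng.normal(size=(200, 2))
actions = (states[:, 0] > 0).astype(int)

def train_bc_model(X, y, lr=0.5, epochs=200):
    """Behavioral cloning step: fit a classifier mapping states to the user's actions."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted action probabilities
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient step on log-loss
        b -= lr * np.mean(p - y)
    return w, b

def predict(w, b, X):
    return (1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5).astype(int)

w, b = train_bc_model(states, actions)

# Stand-in for the RL adversary: probe random states and keep those where the
# cloned model is least confident -- the regions a competing agent would learn
# to exploit, and toward which the framework would drive the user next.
probes = rng.normal(size=(500, 2))
confidence = np.abs(1.0 / (1.0 + np.exp(-(probes @ w + b))) - 0.5)
weak_states = probes[np.argsort(confidence)[:10]]
```

In the full framework the probe step is replaced by an RL agent that plays TAG against the user-model, and the states it wins in become the situations presented to the human in the next data-collection round.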
Modeling user behavior to construct counter strategies