Learning a Transferable World Model by Reinforcement Agent in Deterministic Observable Grid-World Environments
Keywords:Agent, adaptive behavior, world model, learning, planning, decision tree, grid world, reinforcement, percept generalization
AbstractReinforcement-based agents have difficulties in transferring their acquired knowledge into new different environments due to the common identities-based percept representation and the lack of appropriate generalization capabilities. In this paper, the problem of knowledge transferability is addressed by proposing an agent dotted with decision tree induction and constructive induction capabilities and relying on decomposable properties-based percept representation. The agent starts without any prior knowledge of its environment and of the effects of its actions. It learns a world model (the set of decision trees) that corresponds to the set of explicit action definitions predicting action effects in terms of agent’s percepts. Agent’s planning component uses predictions of the world model to chain actions via a breadth-first search. The proposed agent was compared to the Q-learning and Adaptive Dynamic Programming based agents and demonstrated better ability to achieve goals in static observable deterministic grid-world environments different from those in which it has learnt its world model.
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.