Accelerated task learning using Q-learning with reduced state space and reward with bootstrapping effect
DOI:
https://doi.org/10.5755/j01.itc.54.1.36362

Keywords:
Interactive learning, Q-learning, cleaning scenario, task planning, reward patterns, bootstrapping

Abstract
Robots are increasingly present in domestic scenarios. Robots that learn through self-supervised interaction can be more efficient than those relying on pre-programmed intelligence. In this paper, we present Q-learning-based task learning through interaction with the environment in a table-cleaning scenario. The environment consists of a table partitioned into two segments with a single object on it. The goal of the agent is to learn the sequence of tasks required to clean both segments of the table. The state space is designed so that its size is reduced, which improves both training time and success rate. Furthermore, four different reward allocations, denoted r1, r2, r3, and r4, were evaluated. In general, rewards were allocated based on an action's effect on the environment. The reward r4 is allocated in a novel way, by relating two consecutive states, to enhance the bootstrapping effect. Rewards r2 and r4 improved training time compared to r1 and r3. With reward allocation r2, the average reward starts converging around 290 iterations, and a success rate of approximately 84% is reached by 240 iterations. With reward allocation r4, the success rate reaches around 84% by 150 iterations.
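To make the reward idea concrete, below is a minimal sketch of tabular Q-learning with a shaped reward in the spirit of r4, i.e., a reward computed from two consecutive states rather than from a single state. The state encoding, environment dynamics, reward magnitudes, and all names here are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

# Assumed reduced state space: a handful of discrete states encoding the
# robot's task progress; actions are abstract task steps (move, grasp,
# wipe, place). Sizes and semantics are hypothetical.
N_STATES = 8
N_ACTIONS = 4

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # assumed hyperparameters
Q = np.zeros((N_STATES, N_ACTIONS))

def reward_r4(state, next_state, done):
    """Sketch of a reward relating two consecutive states, so progress
    between s_t and s_{t+1} is rewarded directly, strengthening the
    bootstrapped update. The +10/+1/-1 magnitudes are assumptions."""
    if done:
        return 10.0                      # both table segments cleaned
    if next_state > state:               # progress toward the goal (assumed ordering)
        return 1.0
    return -1.0                          # no progress

def step(state, action):
    """Placeholder environment transition; a real table-cleaning
    simulator would go here."""
    next_state = min(state + (1 if action == state % N_ACTIONS else 0),
                     N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, done

for episode in range(300):               # on the order of the paper's ~290 iterations
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if np.random.rand() < EPSILON:
            action = np.random.randint(N_ACTIONS)
        else:
            action = int(np.argmax(Q[state]))
        next_state, done = step(state, action)
        r = reward_r4(state, next_state, done)
        # standard tabular Q-learning update
        target = r + (0.0 if done else GAMMA * np.max(Q[next_state]))
        Q[state, action] += ALPHA * (target - Q[state, action])
        state = next_state
```

Because the consecutive-state reward is nonzero on every transition that changes the state, the temporal-difference target carries useful signal from the first episodes onward, which is one plausible reading of why r4 reaches the reported success rate in fewer iterations than the single-state rewards.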
License
Copyright terms are indicated in the Republic of Lithuania Law on Copyright and Related Rights, Articles 4-37.