Tomashpolskyi, V. (2017) Sparse Negative Feedback for Cooperative Inverse Reinforcement Learning. Master's Thesis / Essay, Artificial Intelligence.
Text: Master_AI_research_project_201_1.pdf - Published Version (4MB)
Text: Akkoord.pdf - Other (restricted to registered users) (44kB)
Abstract
In the current reinforcement learning landscape, much attention has been devoted to solving a wide range of problems with deep learning methods, owing to their ability to approximate complex functions and concepts. One prominent application of reinforcement learning lies in the subfield of cooperative inverse reinforcement learning (CIRL), in which agents discover and adjust to the reward function of the human controlling them. In such a setting it is important that agents can learn transparently. In this thesis we train several well-established reinforcement learning algorithms - A3C, ANQL, DDQN, DQN and A3C RNN - on a model devised to represent a CIRL task with negative feedback, in both POMDP and MDP settings. We investigate the impact of regularization and historical data on their performance, as well as the influence of reward function decomposition on the learning process. We also test the performance of the aforementioned reinforcement learning algorithms on the CIRL problem. We find that the A3C RNN algorithm reliably learns the expected policy without falling into the trap of producing an "optimal value policy". We also find that the DDQN algorithm can learn complex policies with the help of regularization. This research aims to uncover prospects for further work and to investigate the performance of currently available algorithms on negative-feedback-based CIRL tasks.
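As a rough illustration of the setting the abstract describes (not taken from the thesis; the environment, state layout, and parameters below are all hypothetical), the following Python sketch models sparse negative feedback in a toy MDP: a simulated human emits a -1 signal only when the agent enters a disallowed state, and all other behaviour goes unremarked, so a standard learner such as tabular Q-learning must infer what to avoid from rare corrections.

```python
# Illustrative sketch only: a toy 1-D gridworld where the sole reward
# signal is sparse negative feedback from a simulated human.
import random

class SparseNegativeFeedbackEnv:
    """Positions in `forbidden` trigger the human's -1 feedback (hypothetical setup)."""

    def __init__(self, size=10, forbidden=(3, 7), goal=9):
        self.size = size
        self.forbidden = set(forbidden)  # states the simulated human dislikes
        self.goal = goal
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action: 0 = move left, 1 = move right (clamped to the grid)
        self.pos = max(0, min(self.size - 1, self.pos + (1 if action == 1 else -1)))
        done = self.pos == self.goal
        # Sparse negative feedback: -1 only in forbidden states; everything
        # else, including reaching the goal, yields no explicit signal,
        # which is what makes credit assignment hard in this regime.
        reward = -1.0 if self.pos in self.forbidden else 0.0
        return self.pos, reward, done

def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    """Tabular Q-learning baseline on the sketch environment."""
    q = [[0.0, 0.0] for _ in range(env.size)]
    for _ in range(episodes):
        s = env.reset()
        done, steps = False, 0
        while not done and steps < 100:
            # epsilon-greedy action selection
            a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda x: q[s][x])
            s2, r, done = env.step(a)
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s, steps = s2, steps + 1
    return q

if __name__ == "__main__":
    env = SparseNegativeFeedbackEnv()
    q = q_learning(env)
    print([round(max(row), 3) for row in q])  # learned state values under sparse feedback
```

Because the only explicit signal is punishment, the learned values are at most zero; the agent is shaped entirely by what it is told not to do, mirroring the negative-feedback-only regime the thesis studies with deep RL algorithms.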
Item Type: Thesis (Master's Thesis / Essay)
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 15 Feb 2018 08:33
Last Modified: 15 Feb 2018 08:33
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/16225