Hill, Alexander (2022) Contextual Online Imitation Learning (COIL) : A Novel Method of Utilising Guide Policies in Reinforcement Learning. Master's Thesis / Essay, Mathematics.
Abstract
This paper proposes Contextual Online Imitation Learning (COIL), a novel method of utilising guide policies in Reinforcement Learning problems. It demonstrates that COIL has the potential to solve Reinforcement Learning tasks better than both traditional Imitation Learning and Deep Reinforcement Learning algorithms such as Proximal Policy Optimisation (PPO). COIL can also make effective use of non-expert guide policies, making it more flexible than current methods that integrate guide policies. The paper shows that, through COIL, guide policies that achieve good performance on sub-tasks can also help Reinforcement Learning agents solve more complex tasks, a significant improvement in flexibility over traditional Imitation Learning methods. After discussing the prerequisite knowledge in Reinforcement Learning and Imitation Learning in depth, the paper introduces the theory and motivation behind COIL and tests its effectiveness in a self-driving car simulation and on a real-life robot. In both applications, COIL gives stronger results than traditional Imitation Learning, Deep Reinforcement Learning, and the guide policy itself.
| Item Type | Thesis (Master's Thesis / Essay) |
|---|---|
| Supervisor name | Grzegorczyk, M.A. and Carloni, R. |
| Degree programme | Mathematics |
| Thesis type | Master's Thesis / Essay |
| Language | English |
| Date Deposited | 10 Aug 2022 06:26 |
| Last Modified | 10 Aug 2022 06:26 |
| URI | https://fse.studenttheses.ub.rug.nl/id/eprint/28323 |