LeKander, Michael (2019) Episodic Control with Drift Compensation. Master's Thesis / Essay, Computing Science.
Abstract
The ability to learn to act in complex interactive environments is a vital component of human intelligence. Reinforcement Learning is a rapidly growing area of research that aims to produce agents that interact with an environment (e.g. Atari games) to maximize reward (e.g. the final game score). While "deep" approaches have been highly successful in this domain, they have the drawback of requiring millions of frames of experience in order to learn. Model Free Episodic Control is a recently proposed algorithm that addresses this issue by using nearest-neighbors regression. This algorithm has the desirable property of "immediate one-shot learning", allowing it to quickly latch onto successful strategies. In this research, we make three primary additions to the existing work. First, we devise an efficient online approximate nearest neighbors algorithm, which is important for the overall efficiency of the method. Second, we explore a wider spectrum of local regression techniques, of which nearest-neighbors regression is just a single example. Finally, we explore the use of explicit drift compensation to account for changes in the underlying return function. The result is a reinforcement learning algorithm termed Episodic Control with Drift Compensation. Through a series of experiments on a suite of five classic Atari 2600 games, we demonstrate that this novel algorithm improves on the state of the art, particularly in expanding the long-term capacity of the agents.
Item Type: | Thesis (Master's Thesis / Essay) |
---|---|
Supervisor name: | Biehl, M. and Wiering, M.A. |
Degree programme: | Computing Science |
Thesis type: | Master's Thesis / Essay |
Language: | English |
Date Deposited: | 30 Jan 2019 |
Last Modified: | 01 Feb 2019 11:10 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/19079 |