
Episodic Control with Drift Compensation

LeKander, Michael (2019) Episodic Control with Drift Compensation. Master's Thesis / Essay, Computing Science.


The ability to learn to act in complex interactive environments is a vital component of human intelligence. Reinforcement Learning is a rapidly growing area of research that aims to produce agents which interact with an environment (e.g. Atari games) to maximize reward (e.g. the final game score). While "deep" approaches have been highly successful in this domain, they have the drawback of requiring millions of frames of experience in order to learn. Model Free Episodic Control is a recently proposed algorithm that addresses this issue by using nearest-neighbors regression. This algorithm has the desirable property of "immediate one-shot learning", allowing it to quickly latch onto successful strategies. In this research, we make three primary additions to the existing work. First, we devise an efficient online approximate nearest-neighbors algorithm, which is critical to the overall efficiency of the method. Second, we explore a wider spectrum of local regression techniques, of which nearest-neighbors regression is just a single example. Finally, we explore the use of explicit drift compensation, to account for changes in the underlying return function. We ultimately produce the reinforcement learning algorithm termed Episodic Control with Drift Compensation. Through a series of experiments on a suite of five classic Atari 2600 games, we demonstrate that this novel algorithm improves over the state of the art, particularly in expanding the long-term capacity of the agents.
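The core idea behind episodic control can be illustrated with a small sketch: a per-action memory stores (state embedding, observed return) pairs, keeps the best return on revisits (giving the "one-shot" latching behavior), and estimates the value of unseen states by averaging the returns of the k nearest stored neighbors. This is a hedged, minimal illustration of the general MFEC-style mechanism, not the thesis implementation; the class name, parameters, and FIFO eviction are assumptions for brevity.

```python
import math


class EpisodicBuffer:
    """Per-action episodic memory: a sketch of MFEC-style value estimation.

    Assumed details (not from the thesis): FIFO eviction, Euclidean
    distance, and a plain linear scan instead of an approximate
    nearest-neighbors index.
    """

    def __init__(self, k=3, capacity=1000):
        self.k = k                # number of neighbors to average over
        self.capacity = capacity  # maximum stored entries
        self.entries = []         # list of (embedding tuple, return)

    def add(self, embedding, ret):
        # On revisiting a known state, keep the best return seen so far;
        # this is what lets the agent latch onto successful strategies.
        for i, (e, r) in enumerate(self.entries):
            if e == embedding:
                self.entries[i] = (e, max(r, ret))
                return
        if len(self.entries) >= self.capacity:
            self.entries.pop(0)  # FIFO eviction as a simple stand-in
        self.entries.append((embedding, ret))

    def estimate(self, embedding):
        # Exact hit: return the stored value directly.
        for e, r in self.entries:
            if e == embedding:
                return r
        if not self.entries:
            return 0.0
        # Otherwise, average the returns of the k nearest neighbors.
        dists = sorted((math.dist(e, embedding), r) for e, r in self.entries)
        nearest = dists[: self.k]
        return sum(r for _, r in nearest) / len(nearest)


buf = EpisodicBuffer(k=2)
buf.add((0.0, 0.0), 1.0)
buf.add((1.0, 0.0), 3.0)
buf.add((0.0, 0.0), 5.0)  # revisit: stored value becomes max(1.0, 5.0)
print(buf.estimate((0.0, 0.0)))  # exact hit → 5.0
print(buf.estimate((0.5, 0.0)))  # mean of two nearest returns → 4.0
```

An agent would keep one such buffer per action and act greedily over the per-action estimates; the thesis's contributions (an online approximate nearest-neighbors index, richer local regression, and drift compensation) each replace or extend a piece of this baseline.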

Item Type: Thesis (Master's Thesis / Essay)
Degree programme: Computing Science
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 30 Jan 2019
Last Modified: 01 Feb 2019 11:10
