Episodic Control with Drift Compensation

LeKander, Michael (2019) Episodic Control with Drift Compensation. Master's Thesis / Essay, Computing Science.


Abstract

The ability to learn to act in complex interactive environments is a vital component of human intelligence. Reinforcement Learning is a rapidly growing area of research that aims to produce agents which interact with an environment (e.g. Atari games) to maximize reward (e.g. the final game score). While "deep" approaches have been highly successful in this domain, they have the drawback of requiring millions of frames of experience in order to learn. Model-Free Episodic Control is a recently proposed algorithm that addresses this issue by using nearest-neighbors regression. This algorithm has the desirable property of "immediate one-shot learning", allowing it to quickly latch onto successful strategies. In this research, we make three primary additions to the existing work. First, we devise an efficient online approximate nearest-neighbors algorithm, a component that is highly important for the overall efficiency of the method. Second, we explore a wider spectrum of local regression techniques, of which nearest-neighbors regression is just one example. Finally, we explore explicit drift compensation to account for changes in the underlying return function. The result is a reinforcement learning algorithm termed Episodic Control with Drift Compensation. Through a series of experiments on a suite of five classic Atari 2600 games, we demonstrate that this novel algorithm improves on the state of the art, particularly in expanding the long-term capacity of the agents.
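
The nearest-neighbors idea behind episodic control, as summarized in the abstract, can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes one memory table per action, exact rather than approximate neighbor search, a plain mean over the k nearest stored states, and no drift compensation. The class name `EpisodicValueTable` and its methods are hypothetical names chosen for illustration.

```python
import math


class EpisodicValueTable:
    """Hypothetical per-action memory: state embedding -> best return seen."""

    def __init__(self, k=3):
        self.k = k          # number of neighbors used for regression
        self.memory = {}    # tuple(state embedding) -> highest observed return

    def update(self, state, episode_return):
        # One-shot learning: a single good episode immediately raises
        # the stored value for this state; worse returns are ignored.
        key = tuple(state)
        previous = self.memory.get(key, -math.inf)
        self.memory[key] = max(previous, float(episode_return))

    def estimate(self, state):
        key = tuple(state)
        if key in self.memory:
            # Exact match: use the stored value directly.
            return self.memory[key]
        if not self.memory:
            # Assumption: an empty memory yields a neutral estimate of 0.
            return 0.0
        # Otherwise average the values of the k nearest stored states
        # (exact Euclidean search; the thesis proposes an approximate one).
        nearest = sorted(self.memory, key=lambda s: math.dist(s, key))[: self.k]
        return sum(self.memory[s] for s in nearest) / len(nearest)
```

An agent built on this sketch would keep one such table per action and, in each state, pick the action whose table returns the highest estimate. The linear scan in `estimate` grows with the size of the memory, which is the kind of cost an efficient approximate nearest-neighbors search is meant to reduce.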

Item Type:	Thesis (Master's Thesis / Essay)
Supervisor name:	Biehl, M. and Wiering, M.A.
Degree programme:	Computing Science
Thesis type:	Master's Thesis / Essay
Language:	English
Date Deposited:	30 Jan 2019
Last Modified:	01 Feb 2019 11:10
URI:	https://fse.studenttheses.ub.rug.nl/id/eprint/19079
