Investigating Overestimation Bias in Reinforcement Learning

Pentaliotis, Andreas (2020) Investigating Overestimation Bias in Reinforcement Learning. Master's Thesis / Essay, Artificial Intelligence.

Files:
mAI_2020_PentaliotisA.pdf (3MB)
toestemming.pdf (93kB, restricted to registered users only)

Abstract

Overestimation bias is an inherent property of reinforcement learning algorithms that approximate maximum expected values by maximizing uncertain estimates. Since its identification in the literature, overestimation bias has generally been considered to harm reinforcement learning algorithms. In this thesis we investigate overestimation bias by examining Q-learning and conclude that it may have either a negative or a positive effect, depending on the reinforcement learning problem. Based on this conclusion, we propose a new variant of Q-learning, called Variation-resistant Q-learning, that controls and utilizes estimation bias for better performance. We present the tabular version of Variation-resistant Q-learning, prove a convergence theorem for the algorithm in the tabular case, and extend the algorithm to a function approximation solution method. Additionally, we present empirical results from three experiments comparing the performance of Variation-resistant Q-learning, Q-learning, and Double Q-learning. The empirical results verify that Variation-resistant Q-learning can control and utilize estimation bias for better performance in the experimental tasks.
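For context on where the bias comes from, the sketch below shows the standard tabular Q-learning update and the Double Q-learning update mentioned in the abstract. The overestimation arises because the bootstrap target takes a max over uncertain action-value estimates; Double Q-learning reduces it by decoupling action selection from evaluation across two tables. This is a minimal illustration of the two well-known baselines only, not the thesis's Variation-resistant Q-learning algorithm, and the environment interface and variable names are illustrative assumptions.

    import random
    from collections import defaultdict

    def q_learning_update(Q, state, action, reward, next_state, actions,
                          alpha=0.1, gamma=0.99):
        """Standard tabular Q-learning update.

        The max over the (noisy) estimates Q[(next_state, a)] in the bootstrap
        target is the source of the overestimation bias discussed in the abstract.
        """
        best_next = max(Q[(next_state, a)] for a in actions)
        target = reward + gamma * best_next
        Q[(state, action)] += alpha * (target - Q[(state, action)])

    def double_q_update(QA, QB, state, action, reward, next_state, actions,
                        alpha=0.1, gamma=0.99):
        """Double Q-learning update: one table selects the greedy next action,
        the other evaluates it, which reduces the overestimation."""
        if random.random() < 0.5:
            a_star = max(actions, key=lambda a: QA[(next_state, a)])
            target = reward + gamma * QB[(next_state, a_star)]
            QA[(state, action)] += alpha * (target - QA[(state, action)])
        else:
            a_star = max(actions, key=lambda a: QB[(next_state, a)])
            target = reward + gamma * QA[(next_state, a_star)]
            QB[(state, action)] += alpha * (target - QB[(state, action)])

    if __name__ == "__main__":
        actions = [0, 1]
        Q = defaultdict(float)           # single table for Q-learning
        QA, QB = defaultdict(float), defaultdict(float)  # two tables for Double Q-learning
        q_learning_update(Q, "s0", 0, 1.0, "s1", actions)
        double_q_update(QA, QB, "s0", 0, 1.0, "s1", actions)
        print(Q[("s0", 0)], QA[("s0", 0)], QB[("s0", 0)])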

Item Type: Thesis (Master's Thesis / Essay)
Supervisor name: Wiering, M.A.
Degree programme: Artificial Intelligence
Thesis type: Master's Thesis / Essay
Language: English
Date Deposited: 23 Jun 2020 11:27
Last Modified: 23 Jun 2020 11:27
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/22173
