Pentaliotis, Andreas (2020) Investigating Overestimation Bias in Reinforcement Learning. Master's Thesis / Essay, Artificial Intelligence.
Full text: mAI_2020_PentaliotisA.pdf (3MB)
Restricted: toestemming.pdf (93kB, registered users only)
Abstract
Overestimation bias is an inherent property of reinforcement learning algorithms that approximate maximum expected values by maximizing uncertain estimates. Since its identification in the literature, overestimation bias has generally been considered to harm reinforcement learning algorithms. In this thesis we investigate overestimation bias by examining Q-learning and conclude that, depending on the reinforcement learning problem, it may affect an algorithm's performance either negatively or positively. Based on this conclusion, we propose a new variant of Q-learning, called Variation-resistant Q-learning, to control and utilize estimation bias for better performance. We present the tabular version of Variation-resistant Q-learning, prove a convergence theorem for the algorithm in the tabular case, and extend the algorithm to a function approximation solution method. Additionally, we present empirical results from three experiments in which we compared the performance of Variation-resistant Q-learning, Q-learning, and Double Q-learning. The empirical results verify that Variation-resistant Q-learning can control and utilize estimation bias for better performance in the experimental tasks.
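For readers unfamiliar with the two baseline algorithms named in the abstract, the sketch below illustrates where the bias enters: the standard tabular Q-learning target takes a max over uncertain value estimates, while Double Q-learning decouples action selection from evaluation across two estimators. This is a minimal illustrative sketch, not code from the thesis; the state and action counts, learning rate, and discount factor are arbitrary assumptions, and Variation-resistant Q-learning itself is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 3   # illustrative sizes, not from the thesis
alpha, gamma = 0.1, 0.95     # illustrative learning rate and discount factor

# Single-estimator Q-learning: the max over noisy next-state estimates
# tends to overestimate the true maximum expected value.
Q = np.zeros((n_states, n_actions))

def q_learning_update(s, a, r, s_next):
    target = r + gamma * np.max(Q[s_next])        # max of uncertain estimates
    Q[s, a] += alpha * (target - Q[s, a])

# Double Q-learning: maintain two estimators and use one to select the
# greedy action and the other to evaluate it, counteracting overestimation.
QA = np.zeros((n_states, n_actions))
QB = np.zeros((n_states, n_actions))

def double_q_learning_update(s, a, r, s_next):
    if rng.random() < 0.5:
        best = np.argmax(QA[s_next])              # select with A ...
        target = r + gamma * QB[s_next, best]     # ... evaluate with B
        QA[s, a] += alpha * (target - QA[s, a])
    else:
        best = np.argmax(QB[s_next])              # select with B ...
        target = r + gamma * QA[s_next, best]     # ... evaluate with A
        QB[s, a] += alpha * (target - QB[s, a])

# Example transition: from state 0, take action 1, receive reward 1.0, land in state 2.
q_learning_update(0, 1, 1.0, 2)
double_q_learning_update(0, 1, 1.0, 2)
```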
| Item Type: | Thesis (Master's Thesis / Essay) |
|---|---|
| Supervisor name: | Wiering, M.A. |
| Degree programme: | Artificial Intelligence |
| Thesis type: | Master's Thesis / Essay |
| Language: | English |
| Date Deposited: | 23 Jun 2020 11:27 |
| Last Modified: | 23 Jun 2020 11:27 |
| URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/22173 |