Krol, Daan and Brandenburg, Jeroen van (2020) Q-learning adaptations in the game Othello. Bachelor's Thesis, Artificial Intelligence.
Abstract
Reinforcement learning algorithms are widely used to select actions that maximize the reward in a given situation. Q-learning is one such algorithm: it estimates the quality of performing an action in a certain state, and these estimates are continuously updated with each experience. In this thesis we compare different adaptations of the Q-learning algorithm for teaching an agent to play the board game Othello. We discuss the use of a second estimator in Double Q-learning, the addition of a V-value function in QV- and QV2-learning, and we consider the on-policy variant of Q-learning called SARSA. A multilayer perceptron is used as a function approximator and is compared to a convolutional neural network. Results indicate that SARSA, QV-, and QV2-learning perform better than Q-learning. The addition of a second estimator in Double Q-learning does not appear to improve the performance of Q-learning. SARSA and QV2-learning converge more slowly and struggle to escape local minima, whereas QV-learning converges faster. The results also show that the multilayer perceptron outperforms the convolutional neural network.
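For reference, the standard one-step update rules for the algorithms named in the abstract are sketched below. These are the textbook forms from the reinforcement learning literature, not taken from the thesis itself, which may use eligibility traces, different learning schedules, or the network-based variants it describes; here α denotes the learning rate and γ the discount factor.

```latex
% Standard one-step update rules (textbook forms, assumed for illustration;
% \alpha = learning rate, \gamma = discount factor).

% Q-learning (off-policy): bootstrap from the greedy next action.
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \bigl( r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \bigr)

% SARSA (on-policy): bootstrap from the action actually taken next.
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \bigl( r_t + \gamma Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \bigr)

% QV-learning: a separate state-value function V is learned by TD,
% and the Q-update bootstraps from V instead of from Q.
V(s_t) \leftarrow V(s_t)
  + \alpha \bigl( r_t + \gamma V(s_{t+1}) - V(s_t) \bigr)
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \bigl( r_t + \gamma V(s_{t+1}) - Q(s_t, a_t) \bigr)

% Double Q-learning: two estimators Q^A and Q^B; one selects the greedy
% action, the other evaluates it (shown here for updating Q^A).
Q^A(s_t, a_t) \leftarrow Q^A(s_t, a_t)
  + \alpha \Bigl( r_t + \gamma\, Q^B\bigl(s_{t+1},
      \arg\max_{a} Q^A(s_{t+1}, a)\bigr) - Q^A(s_t, a_t) \Bigr)
```

QV2-learning, also compared in the thesis, is a variant of QV-learning; its exact update is not reproduced here.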
| Item Type: | Thesis (Bachelor's Thesis) |
|---|---|
| Supervisor name: | Wiering, M.A. |
| Degree programme: | Artificial Intelligence |
| Thesis type: | Bachelor's Thesis |
| Language: | English |
| Date Deposited: | 10 Aug 2020 07:04 |
| Last Modified: | 10 Aug 2020 07:04 |
| URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/23036 |