Heeres, K.J. (2020) QVA-learning: Testing a Novel Reinforcement Learning Algorithm Using Other-Play in the Helenix Environment. Bachelor's Thesis, Artificial Intelligence.
Abstract
In this paper we use the Helenix environment and other-play to test a new reinforcement learning algorithm called QVA-learning. This algorithm builds upon QV-learning by adding a function intended to improve the learning behaviour. Within the game of Helenix, five different algorithms are trained against each other using other-play. When comparing the final models, we found that Q-learning performed best under these conditions and that QVA-learning outperformed both double Q-learning and its predecessor, QV-learning. However, over the entire learning process, only SARSA outperformed QVA-learning. We conclude that QVA-learning shows potential and improves upon its predecessor, QV-learning.
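For orientation, the QV-learning update that QVA-learning builds on can be sketched in tabular form. This is a minimal sketch of the standard QV-learning rule, not the implementation used in the thesis: the function name `qv_update`, the learning rates `alpha` and `beta`, the discount `gamma`, and the toy table sizes are all assumptions made for illustration. The additional function that QVA-learning introduces is not described in the abstract and is therefore not shown.

```python
def qv_update(Q, V, s, a, r, s_next, done, alpha=0.1, beta=0.1, gamma=0.9):
    """One tabular QV-learning step (illustrative sketch, hypothetical names).

    Both the state-value table V and the action-value table Q are moved
    toward the same target, which bootstraps from V of the next state.
    """
    # Target is computed before either table is modified.
    target = r + (0.0 if done else gamma * V[s_next])
    V[s] += beta * (target - V[s])         # state-value update
    Q[s][a] += alpha * (target - Q[s][a])  # action-value update uses V, not max Q
    return target


# Toy usage: 3 states, 2 actions, all estimates initialised to zero (assumed sizes).
Q = [[0.0, 0.0] for _ in range(3)]
V = [0.0] * 3
qv_update(Q, V, s=0, a=1, r=1.0, s_next=1, done=False)
```

Unlike Q-learning, the target here does not take a maximum over next-state action values; it relies on the separately learned state-value estimate.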
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Supervisor name: | Wiering, M.A. |
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 02 Sep 2020 07:15 |
Last Modified: | 02 Sep 2020 07:15 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/23344 |