Groot Kormelink, J.G.C. (2017) Comparison of Exploration Methods for Connectionist Reinforcement Learning in the game Bomberman. Bachelor's Thesis, Artificial Intelligence.
Abstract
In this thesis, we investigate which exploration method yields the best performance in the game Bomberman. In Bomberman, the controlled agent has to kill opponents by placing bombs. The agent is represented by a multi-layer perceptron that learns to play the game using Q-learning. We compare the learning performance of seven exploration methods: RandomWalk, Greedy, E-Greedy, Diminishing E-Greedy, Error-Driven, Max-Boltzmann, and TD-Error. Bomberman is implemented as a deterministic framework, and the agents are built on top of this framework. The results show that Max-Boltzmann exploration performs best, with a win rate of 88%, which is 2% higher than that of the second-best method, Diminishing E-Greedy. Furthermore, Max-Boltzmann gathers on average 30 points more than Diminishing E-Greedy. Error-Driven exploration outperforms all other exploration methods over the first 80 generations; however, it also produces unstable behavior.
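For readers unfamiliar with the best-performing method, below is a minimal Python sketch of Max-Boltzmann action selection as it is commonly defined (act greedily with probability 1 − ε, otherwise sample from the Boltzmann/softmax distribution over the Q-values). The function name and the default values of `epsilon` and `tau` are illustrative assumptions, not parameters taken from the thesis.

```python
import numpy as np

def max_boltzmann_action(q_values, epsilon=0.1, tau=1.0, rng=None):
    """Max-Boltzmann exploration: act greedily with probability 1 - epsilon;
    otherwise sample an action from the Boltzmann (softmax) distribution
    over the Q-values, so exploration still favours higher-valued actions."""
    rng = rng or np.random.default_rng()
    q = np.asarray(q_values, dtype=float)
    if rng.random() > epsilon:
        return int(np.argmax(q))                 # exploit: greedy action
    prefs = q / tau                              # temperature controls greediness
    prefs -= prefs.max()                         # stabilise the exponentials
    probs = np.exp(prefs) / np.exp(prefs).sum()  # softmax over Q-values
    return int(rng.choice(len(q), p=probs))      # explore: softmax sample

# Hypothetical usage with Q-values for three actions:
action = max_boltzmann_action([0.2, 1.5, -0.3], epsilon=0.2, tau=0.5)
```

Unlike plain E-Greedy, the exploratory step here is biased toward promising actions rather than uniform; a low temperature `tau` makes even the exploratory draws nearly greedy.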
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 15 Feb 2018 08:30 |
Last Modified: | 15 Feb 2018 08:30 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/15450 |