Groot Kormelink, J.G.C. (2017) Comparison of Exploration Methods for Connectionist Reinforcement Learning in the game Bomberman. Bachelor's Thesis, Artificial Intelligence.
Abstract
In this thesis, we investigate which exploration method yields the best performance in the game Bomberman. In Bomberman, the controlled agent has to kill opponents by placing bombs. The agent is represented by a multi-layer perceptron that learns to play the game using Q-learning. We compare the learning performance of seven exploration methods: RandomWalk, Greedy, E-Greedy, Diminishing E-Greedy, Error-Driven, Max-Boltzmann, and TD-Error. Bomberman is implemented as a deterministic framework, and the agents are built on top of this framework. The results show that Max-Boltzmann exploration performs best, with a win rate of 88%, which is 2% higher than that of the second-best method, Diminishing E-Greedy. Furthermore, Max-Boltzmann gathers on average 30 points more than Diminishing E-Greedy. Error-Driven exploration outperforms all other exploration methods over the first 80 generations; however, it also produces unstable behavior.
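For readers unfamiliar with the best-performing method, below is a minimal Python sketch of Max-Boltzmann action selection as it is commonly defined (act greedily with probability 1 − ε, otherwise sample from the Boltzmann/softmax distribution over the Q-values). The function name and the default values of `epsilon` and `tau` are illustrative assumptions, not parameters taken from the thesis.

```python
import numpy as np

def max_boltzmann_action(q_values, epsilon=0.1, tau=1.0, rng=None):
    """Max-Boltzmann exploration: act greedily with probability 1 - epsilon;
    otherwise sample an action from the Boltzmann (softmax) distribution
    over the Q-values, so exploration still favours higher-valued actions."""
    rng = rng or np.random.default_rng()
    q = np.asarray(q_values, dtype=float)
    if rng.random() > epsilon:
        return int(np.argmax(q))                 # exploit: greedy action
    prefs = q / tau                              # temperature controls greediness
    prefs -= prefs.max()                         # stabilise the exponentials
    probs = np.exp(prefs) / np.exp(prefs).sum()  # softmax over Q-values
    return int(rng.choice(len(q), p=probs))      # explore: softmax sample

# Hypothetical usage with Q-values for three actions:
action = max_boltzmann_action([0.2, 1.5, -0.3], epsilon=0.2, tau=0.5)
```

Unlike plain E-Greedy, the exploratory step here is biased toward promising actions rather than uniform; a low temperature `tau` makes even the exploratory draws nearly greedy.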
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 15 Feb 2018 08:30 |
Last Modified: | 15 Feb 2018 08:30 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/15450 |