Pronk, Remco (2018) Hierarchical reinforcement learning in multiplayer Bomberman. Bachelor's Thesis, Artificial Intelligence.
Abstract
Experiments were conducted to compare the win rates of agents trained with hierarchical reinforcement learning and flat reinforcement learning in the multiplayer mode of the video game Bomberman. The performance of a single network, two networks, and four networks was compared. Four bombermen are placed together in an arena. This arena contains walls, which are always placed in the same locations at the start of each game. Bombermen and some of these walls can be destroyed by exploding bombs, which the bombermen can place. Multilayer perceptrons are used to approximate the utility of each state-action pair in Q-learning. The exploration strategies used are Error-Driven-epsilon, a variant of Diminishing epsilon-Greedy, and Max-Boltzmann. Each trial consisted of a hundred generations of 10,000 training games followed by 100 test games. A significant difference in results was found with both exploration strategies: during the first twenty generations the win rate is higher for both the two-network and the four-network experiments, after which it drops below that of flat reinforcement learning.
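As a rough illustration of the Max-Boltzmann exploration strategy mentioned in the abstract, the sketch below selects actions greedily with probability 1 - epsilon and otherwise samples from a Boltzmann (softmax) distribution over the estimated Q-values. The function and parameter names (`epsilon`, `temperature`) are illustrative assumptions, not taken from the thesis.

```python
import math
import random

def max_boltzmann_action(q_values, epsilon=0.1, temperature=1.0):
    """Max-Boltzmann exploration sketch: greedy with probability 1 - epsilon,
    otherwise sample from a softmax over Q-values.
    (Parameter values are illustrative, not from the thesis.)"""
    if random.random() >= epsilon:
        # Exploit: pick the action with the highest estimated utility.
        return max(range(len(q_values)), key=lambda a: q_values[a])
    # Explore: Boltzmann distribution (subtract the max for numerical stability).
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    return random.choices(range(len(q_values)), weights=weights)[0]
```

With epsilon set to 0 this reduces to pure greedy action selection; with epsilon set to 1 it reduces to pure Boltzmann exploration.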
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Supervisor name: | Wiering, M.A. |
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 09 Mar 2018 |
Last Modified: | 12 Mar 2018 11:55 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/16546 |