Region enhanced neural Q-learning in partially observable Markov decision processes

Kooi, T (2010) Region enhanced neural Q-learning in partially observable Markov decision processes. Bachelor's Thesis, Artificial Intelligence.

BachelorThesis_Tkooi.pdf - Published Version


For a robot to perform tasks autonomously, it has to plan its behavior and make decisions based on the input it receives. Unfortunately, contemporary robot sensors and actuators are subject to noise, so optimal decision making has to account for stochastic observations and action outcomes. Partially observable Markov decision processes (POMDPs) can be used to model this process. In this paper we introduce RENQ, a new POMDP algorithm that combines neural networks for estimating Q-values with a spatial pyramid constructed over the state space. RENQ computes region-based belief vectors in addition to state-based belief vectors and uses both as inputs to a neural network trained with Q-learning. We compare RENQ to Qmdp and Perseus, two state-of-the-art algorithms for approximately solving model-based POMDPs. The results on three different maze navigation tasks indicate that RENQ outperforms Perseus on all problems and outperforms Qmdp as the problems become larger.
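The idea of combining state-based and region-based belief vectors can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a 2D grid maze, square non-overlapping regions, and a region belief defined as the sum of the state beliefs inside the region. The names `region_beliefs` and `pyramid_input` and the `cell_sizes` parameter are hypothetical, and the neural network and Q-learning update are left out.

```python
from typing import List, Sequence


def region_beliefs(belief_grid: List[List[float]], cell: int) -> List[float]:
    """Sum state beliefs over non-overlapping cell x cell regions (row-major order).

    Assumption: a region's belief is the total probability mass of its states.
    """
    rows, cols = len(belief_grid), len(belief_grid[0])
    out: List[float] = []
    for r0 in range(0, rows, cell):
        for c0 in range(0, cols, cell):
            total = 0.0
            # Clip at the grid border so partial regions are still covered.
            for r in range(r0, min(r0 + cell, rows)):
                for c in range(c0, min(c0 + cell, cols)):
                    total += belief_grid[r][c]
            out.append(total)
    return out


def pyramid_input(belief_grid: List[List[float]],
                  cell_sizes: Sequence[int] = (2, 4)) -> List[float]:
    """Concatenate the state-level belief with one region-belief vector per pyramid level.

    The result is the kind of feature vector that could be fed to a Q-value network.
    """
    features = [b for row in belief_grid for b in row]  # state-based belief vector
    for cell in cell_sizes:                              # coarser pyramid levels
        features.extend(region_beliefs(belief_grid, cell))
    return features


# Usage: a uniform belief over a 4x4 maze.
grid = [[1.0 / 16.0] * 4 for _ in range(4)]
features = pyramid_input(grid)
print(len(features))  # 16 state entries + 4 regions of 2x2 + 1 region of 4x4
```

In this sketch the coarser levels give the network a summary of where the probability mass lies, even when the state-level belief is spread thinly over many cells.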

Item Type: Thesis (Bachelor's Thesis)
Degree programme: Artificial Intelligence
Thesis type: Bachelor's Thesis
Language: English
Date Deposited: 15 Feb 2018 07:31
Last Modified: 15 Feb 2018 07:31
