Kooi, T. (2010). Region enhanced neural Q-learning in partially observable Markov decision processes. Bachelor's Thesis, Artificial Intelligence.
Full text: BachelorThesis_Tkooi.pdf (Published Version, 1MB)
Abstract
To perform tasks autonomously, a robot has to plan its behavior and make decisions based on the input it receives. Unfortunately, contemporary robot sensors and actuators are subject to noise, which makes optimal decision making a stochastic problem. This process can be modeled as a partially observable Markov decision process (POMDP). In this thesis we introduce RENQ, a new POMDP algorithm that combines neural networks for estimating Q-values with a spatial pyramid constructed over the state space. RENQ augments the state-based belief vector with region-based belief vectors derived from the pyramid, and feeds both as input to a neural network trained with Q-learning. We compare RENQ to Qmdp and Perseus, two state-of-the-art algorithms for approximately solving model-based POMDPs. The results on three different maze navigation tasks indicate that RENQ outperforms Perseus on all problems and outperforms Qmdp as the problem size grows.
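The abstract gives no implementation details, but its core idea — pooling a state belief over a spatial pyramid and feeding the concatenated state-based and region-based beliefs to a Q-network — can be sketched. The following is a minimal illustration under stated assumptions, not the thesis implementation: the 8x8 maze, the 2x2 and 4x4 pyramid levels, the network size, and all function names are assumptions made for the example.

```python
import numpy as np

GRID = 8                        # assumed 8x8 maze (illustrative)
N_STATES = GRID * GRID
N_ACTIONS = 4                   # e.g. move north/south/east/west

def region_beliefs(belief, levels=(2, 4)):
    """Pool the state belief over a spatial pyramid of coarser grids."""
    b = belief.reshape(GRID, GRID)
    pooled = []
    for k in levels:            # k x k regions at each pyramid level
        step = GRID // k
        for i in range(k):
            for j in range(k):
                pooled.append(b[i*step:(i+1)*step, j*step:(j+1)*step].sum())
    return np.array(pooled)

def renq_input(belief):
    """Concatenate the state-based and region-based belief vectors."""
    return np.concatenate([belief, region_beliefs(belief)])

# A tiny one-hidden-layer Q-network with hand-coded backpropagation.
rng = np.random.default_rng(0)
N_IN = N_STATES + 2*2 + 4*4     # 64 state beliefs + 4 + 16 region beliefs
W1 = rng.normal(0.0, 0.1, (N_IN, 32)); b1 = np.zeros(32)
W2 = rng.normal(0.0, 0.1, (32, N_ACTIONS)); b2 = np.zeros(N_ACTIONS)

def q_values(x):
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2, h

def q_learning_step(x, a, r, x_next, alpha=0.01, gamma=0.95):
    """One Q-learning update: move Q(x, a) toward r + gamma * max_a' Q(x', a')."""
    global W1, b1, W2, b2
    q, h = q_values(x)
    q_next, _ = q_values(x_next)
    err = r + gamma * q_next.max() - q[a]    # TD error
    grad_q = np.zeros(N_ACTIONS); grad_q[a] = -err
    grad_h = (grad_q @ W2.T) * (1.0 - h**2)  # backprop through tanh
    W2 -= alpha * np.outer(h, grad_q); b2 -= alpha * grad_q
    W1 -= alpha * np.outer(x, grad_h); b1 -= alpha * grad_h

# Usage: a uniform belief over the maze, one greedy action, one update.
belief = np.full(N_STATES, 1.0 / N_STATES)
x = renq_input(belief)
q, _ = q_values(x)
q_learning_step(x, a=int(q.argmax()), r=1.0, x_next=x)
```

In a full POMDP agent the belief would be updated with a Bayes filter after every action and observation; the pyramid pooling simply gives the network a coarse-to-fine summary of where the probability mass lies, which is presumably the "region enhancement" the title refers to.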
Item Type: Thesis (Bachelor's Thesis)
Degree programme: Artificial Intelligence
Thesis type: Bachelor's Thesis
Language: English
Date Deposited: 15 Feb 2018 07:31
Last Modified: 15 Feb 2018 07:31
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/9133