Majeris, Mantas (2022) Extending the QV family of deep reinforcement learning algorithms: DQV2 and DQV-Max2. Bachelor's Thesis, Artificial Intelligence.
Abstract
This paper attempts to extend the QV family of deep reinforcement learning algorithms, in which both the state value function and the state-action value function are approximated using neural networks. We introduce two new algorithms, DQV2 and DQV-Max2, based on the classical RL algorithms QV2 and QV-Max2. We run multiple experiments in the Cart-Pole, Acrobot, and Mountain-Car environments, comparing the reward obtained over time by DQV2 and DQV-Max2 against several established algorithms: DQV and DQV-Max from the QV family, and DQN, which learns only the state-action value function. Preliminary results suggest that DQV-Max2 is comparable to DQV-Max in performance, while DQV2 vastly underperforms with most of the hyperparameter combinations used in this study. However, for some hyperparameter combinations, DQV2 and DQV-Max2 achieve performance comparable to DQV and DQV-Max, with greater consistency.
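To illustrate what learning both value functions means in practice, the following is a minimal sketch of the temporal-difference targets used by the two established baselines named in the abstract, DQV and DQV-Max, as they are commonly described in the literature. It is not taken from the thesis: the function names, argument names, and default discount factor are hypothetical, the value estimates are passed in as plain numbers rather than computed by networks, and details such as target networks and replay buffers are left out. The update rules of DQV2 and DQV-Max2 are not stated in the abstract and are therefore not sketched.

```python
def dqv_targets(reward, done, v_next, gamma=0.99):
    """DQV (as commonly described): the V and Q estimates both regress
    toward r + gamma * V(s'). Returns (v_target, q_target)."""
    # No bootstrapping from the next state when the episode has ended.
    bootstrap = 0.0 if done else gamma * v_next
    target = reward + bootstrap
    return target, target


def dqv_max_targets(reward, done, v_next, q_next, gamma=0.99):
    """DQV-Max (as commonly described): the V target bootstraps from
    max_a Q(s', a), while the Q target bootstraps from V(s').
    `q_next` is a sequence of Q estimates, one per action in s'."""
    if done:
        return reward, reward
    v_target = reward + gamma * max(q_next)
    q_target = reward + gamma * v_next
    return v_target, q_target
```

For example, with a reward of 1.0, a discount of 0.5, V(s') = 2.0, and Q(s', ·) = [0.0, 4.0], `dqv_targets` gives the same target of 2.0 to both estimators, whereas `dqv_max_targets` gives 3.0 to the V estimate and 2.0 to the Q estimate.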
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Supervisor name: | Sabatelli, M. |
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 22 Jul 2022 10:50 |
Last Modified: | 22 Jul 2022 10:50 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/28109 |