Ferguson, Stan (2025) Exploring One-Step Fixed Horizon Q Learning in Tabular Stochastic Environments. Bachelor's Thesis, Artificial Intelligence.
Abstract
In this study, we test Fixed Horizon Q-learning (FHQ) in tabular stochastic environments. FHQ was proposed by De Asis et al. [2020] as an alternative to regular (infinite-horizon) Q-learning that breaks the deadly triad of reinforcement learning by countering bootstrapping from unreliable estimates: instead of updating the value function towards an estimate of the entire episode return, the value function estimates the return up to a fixed horizon length. We reproduce and build on the previous research to better pinpoint the advantages and disadvantages of this method, particularly the performance of different horizon lengths and their computational cost over longer periods of time. We show that, contrary to our initial assumption based on the previous research, shorter horizons do not necessarily perform better in highly stochastic environments. We identify a trade-off between horizon length and computational cost, and find that α-decay is necessary for successful empirical convergence, which is not generally the case for most RL algorithms.
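For concreteness, below is a minimal NumPy sketch of one-step fixed-horizon Q-learning as described in the abstract: a stack of value tables `Q[1..H]`, where each horizon bootstraps only from the horizon below it and `Q[0]` is fixed at zero. The `env` interface (`reset()`/`step()` returning `(s_next, r, done)`), the function name `run_fhq`, the ε-greedy policy, and the `1/(1 + decay·t)` α-decay schedule are illustrative assumptions, not the thesis's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_fhq(env, n_states, n_actions, H=10, gamma=0.99,
            alpha0=0.5, decay=1e-3, epsilon=0.1, n_steps=50_000):
    """Sketch of tabular one-step fixed-horizon Q-learning (FHQ).

    Q has shape (H + 1, n_states, n_actions); Q[h] estimates the
    h-step return, Q[0] stays zero, and acting uses the final
    horizon Q[H]. `env` is an assumed interface, not a real library.
    """
    Q = np.zeros((H + 1, n_states, n_actions))
    s = env.reset()
    for t in range(n_steps):
        alpha = alpha0 / (1.0 + decay * t)  # illustrative alpha-decay schedule
        # epsilon-greedy action selection on the final-horizon estimate
        if rng.random() < epsilon:
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmax(Q[H, s]))
        s_next, r, done = env.step(a)
        # update every horizon: Q[h] bootstraps from Q[h - 1], never from itself
        for h in range(1, H + 1):
            target = r if done else r + gamma * np.max(Q[h - 1, s_next])
            Q[h, s, a] += alpha * (target - Q[h, s, a])
        s = env.reset() if done else s_next
    return Q
```

Note that the inner loop over horizons makes each step roughly H times more expensive than a standard Q-learning update, which is the computational trade-off against horizon length that the abstract mentions.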
| Item Type: | Thesis (Bachelor's Thesis) |
|---|---|
| Supervisor name: | Fernandes Cunha, R. |
| Degree programme: | Artificial Intelligence |
| Thesis type: | Bachelor's Thesis |
| Language: | English |
| Date Deposited: | 28 Jul 2025 11:07 |
| Last Modified: | 28 Jul 2025 11:07 |
| URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/36511 |