
Exploring One-Step Fixed Horizon Q Learning in Tabular Stochastic Environments

Ferguson, Stan (2025) Exploring One-Step Fixed Horizon Q Learning in Tabular Stochastic Environments. Bachelor's Thesis, Artificial Intelligence.

Files:
- Submission-readyBscThesisFHQlearningStochasticity.pdf (1MB)
- Toestemming.pdf (180kB), restricted to registered users only
Abstract

In this study, we test Fixed Horizon Q-learning (FHQ) in tabular stochastic environments. FHQ was proposed by Asis et al. [2020] as an alternative to regular (infinite-horizon) Q-learning, aiming to break the deadly triad of reinforcement learning by countering bootstrapping on unreliable estimates. Instead of updating the value function with the entire episode return, a fixed horizon length is set. We reproduce and build on the previous research to better pinpoint the advantages and disadvantages of this method, particularly the performance of different horizon lengths and their computational cost over longer training periods. We show that, contrary to our initial assumption based on the previous research, shorter horizons do not necessarily perform better in highly stochastic environments. We identify a trade-off between horizon length and computational cost, and find that α-decay is necessary for successful empirical convergence, which is not generally the case for most RL algorithms.
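For readers unfamiliar with the update rule, the sketch below illustrates one-step fixed-horizon Q-learning in a tabular setting, including a simple α-decay schedule of the kind the abstract refers to. It is a minimal sketch, not the code used in the thesis: the environment interface (`env.reset()`, `env.step()` returning `(next_state, reward, done)`) and all parameter names are illustrative assumptions.

```python
import numpy as np

def fixed_horizon_q_learning(env, n_states, n_actions, H=10,
                             episodes=5000, gamma=1.0,
                             alpha0=0.5, epsilon=0.1, seed=0):
    """Hypothetical sketch of tabular one-step fixed-horizon Q-learning
    (after Asis et al. [2020]). Q[h] estimates the optimal h-step return;
    Q[0] is identically zero, so each Q[h] bootstraps on Q[h-1]."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((H + 1, n_states, n_actions))

    for ep in range(episodes):
        # Simple alpha-decay schedule (assumed form); the abstract reports
        # that some decay is needed for empirical convergence.
        alpha = alpha0 / (1.0 + ep / 100.0)
        s = env.reset()
        done = False
        while not done:
            # Epsilon-greedy behaviour policy based on the full-horizon estimate Q[H].
            if rng.random() < epsilon:
                a = int(rng.integers(n_actions))
            else:
                a = int(np.argmax(Q[H, s]))
            s_next, r, done = env.step(a)
            for h in range(1, H + 1):
                # Bootstrap on the (h-1)-horizon value of the next state;
                # terminal transitions contribute no future value.
                bootstrap = 0.0 if done else np.max(Q[h - 1, s_next])
                Q[h, s, a] += alpha * (r + gamma * bootstrap - Q[h, s, a])
            s = s_next
    return Q
```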

Item Type: Thesis (Bachelor's Thesis)
Supervisor name: Fernandes Cunha, R.
Degree programme: Artificial Intelligence
Thesis type: Bachelor's Thesis
Language: English
Date Deposited: 28 Jul 2025 11:07
Last Modified: 28 Jul 2025 11:07
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/36511
