Arapu, Diana-Maria (2024) Sparse Rewards Reinforcement Learning: Addressing Vanishing Intrinsic Rewards in Change-Based Exploration Transfer. Bachelor's Thesis, Artificial Intelligence.
Abstract
Exploration is one of the main challenges in sparse-rewards reinforcement learning. The Change-Based Exploration Transfer (C-Bet) method leverages intrinsic motivation to learn an exploration policy that mitigates the lack of external rewards and guides exploration. The agent receives a higher intrinsic reward when visiting less frequented areas and when performing actions that change the environment. However, because the exploration policy relies solely on these intrinsic rewards, they diminish over time and eventually vanish. This study proposes a novel approach to address this issue by introducing a time-variant component to the C-Bet intrinsic reward. The modified algorithm was evaluated in various MiniGrid environments, procedurally generated gridworlds that present exploration challenges due to sparse rewards. The results indicate that the modified algorithm improves exploration in some environments compared to the original C-Bet algorithm. However, its performance depends on the structure of the environment, showing limited generalization. Nonetheless, this improvement highlights the potential for more effective exploration strategies that rely on intrinsic motivation.
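The abstract's idea can be illustrated with a minimal sketch. This is not the thesis implementation: the count-based reward below follows the general C-Bet spirit (higher reward for rarely visited states and rarely seen environment changes), and the multiplicative time-variant factor, its `beta` parameter, and all names are hypothetical choices for illustration only.

```python
from collections import defaultdict


class TimeVariantIntrinsicReward:
    """Illustrative sketch of a count-based intrinsic reward with a
    hypothetical time-variant factor meant to counteract vanishing rewards.
    Not the thesis's actual algorithm."""

    def __init__(self, beta=0.001):
        self.state_counts = defaultdict(int)   # visits per state
        self.change_counts = defaultdict(int)  # occurrences per environment change
        self.t = 0                             # global timestep
        self.beta = beta                       # hypothetical growth rate of the time factor

    def reward(self, state, change=None):
        self.t += 1
        self.state_counts[state] += 1
        if change is not None:
            self.change_counts[change] += 1
        # Count-based reward: larger for rarely visited states and
        # rarely observed changes; shrinks toward zero as counts grow.
        base = 1.0 / (self.state_counts[state] + self.change_counts.get(change, 0))
        # Hypothetical time-variant component: slowly rescales the reward
        # over time so it does not vanish entirely.
        return base * (1.0 + self.beta * self.t)
```

As the counts grow, `base` decays; the time-dependent factor is one way such a decay could be partially offset, which is the kind of modification the thesis investigates.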
| Item Type: | Thesis (Bachelor's Thesis) |
|---|---|
| Supervisor name: | Fernandes Cunha, R. |
| Degree programme: | Artificial Intelligence |
| Thesis type: | Bachelor's Thesis |
| Language: | English |
| Date Deposited: | 07 Aug 2024 09:28 |
| Last Modified: | 07 Aug 2024 09:28 |
| URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/33859 |