Milenov, Viktor (2024) Incorporating a Distance Metric to Induce Safe Behavior in Super Mario Bros using Deep Reinforcement Learning. Bachelor's Thesis, Artificial Intelligence.
Abstract
Safe Reinforcement Learning (Safe RL) is a branch of machine learning that aims to guarantee safe behavior in a system while optimizing its performance. To experiment with such algorithms and models we use video games, specifically Super Mario Bros, which offer environments where faulty behavior has no serious real-life consequences, so the algorithms and models can be modified and improved freely. We employ the Actor-Critic method because its training procedure is effective and stable: it learns a value function and a policy at the same time and directly modifies the gradient descent direction to guarantee system safety. Our model consists of an actor network and an ensemble of critic networks, the latter used to obtain a more accurate approximation of the value function. Furthermore, we incorporate a distance metric that represents the distance between our agent and the closest danger in the environment. We compare the performance of this safer model with a baseline model to assess whether the distance metric indirectly induces safe behavior of the agent in the environment. More broadly, we investigate ways to improve an agent's training to make it more "safe".
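The two ingredients named in the abstract, a distance to the closest danger and an ensemble of critics, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the thesis's implementation: the function names `closest_danger_distance` and `ensemble_value` are hypothetical, Euclidean distance over 2D positions is only one possible choice of metric, and averaging the critics' outputs is only one possible way to combine an ensemble.

```python
import math
from statistics import mean


def closest_danger_distance(agent_xy, dangers_xy):
    """Return the Euclidean distance from the agent to the nearest danger.

    Assumption: positions are (x, y) pairs in the same coordinate system.
    Returns math.inf when no danger is present.
    """
    if not dangers_xy:
        return math.inf
    return min(math.dist(agent_xy, danger) for danger in dangers_xy)


def ensemble_value(critics, state):
    """Combine several critics' value estimates by taking their mean.

    Assumption: each critic is a callable mapping a state to a float.
    """
    return mean(critic(state) for critic in critics)


if __name__ == "__main__":
    # Hypothetical positions: agent at the origin, two dangers nearby.
    distance = closest_danger_distance((0, 0), [(3, 4), (10, 0)])
    # Stand-in critics that return fixed values regardless of the state.
    value = ensemble_value([lambda s: 1.0, lambda s: 3.0], state=None)
    print(distance, value)
```

How such a distance enters training (for example as an extra observation feature or as a term in the reward) is not specified in the abstract, so the sketch deliberately stops at computing the two quantities.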
Item Type: | Thesis (Bachelor's Thesis) |
---|---|
Supervisor name: | Cardenas Cartagena, J. D. |
Degree programme: | Artificial Intelligence |
Thesis type: | Bachelor's Thesis |
Language: | English |
Date Deposited: | 03 May 2024 09:08 |
Last Modified: | 03 May 2024 11:04 |
URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/32365 |