Low-Latency Language-Action Foundation Models via Upside-Down RL

Sawala, Lukasz (2025) Low-Latency Language-Action Foundation Models via Upside-Down RL. Bachelor's Thesis, Artificial Intelligence.

Abstract

This paper explores the Upside-Down Reinforcement Learning (UDRL) algorithm, an offline RL paradigm, introducing novel transformer-based architectures to create a scalable and controllable framework for efficient low-resource command-conditioned behavior in complex state-action spaces. Two architectures are proposed: UDRLt and UDRLt-MLP, both leveraging lightweight transformers for efficient control in continuous action spaces. Results show that UDRLt-MLP significantly outperforms the Decision Transformer baseline and achieves higher alignment with desired outcomes, even under out-of-distribution commands, while requiring only a fraction of computational resources. In more challenging transfer settings like AntMaze, fine-tuning and iterative self-improvement via rollout-based imitation partially recover performance, though limitations in dataset quality persist. A self-imitation algorithm is proposed to mitigate data scarcity issues. The findings highlight UDRL’s potential as a foundation for scalable and aligned control systems while identifying issues and future research directions.

Item Type: Thesis (Bachelor's Thesis)
Supervisor name: Cardenas Cartagena, J. D. and Sabatelli, M.
Degree programme: Artificial Intelligence
Thesis type: Bachelor's Thesis
Language: English
Date Deposited: 20 Aug 2025 09:26
Last Modified: 20 Aug 2025 09:26
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/36798
