Lyapunov Dual-Policy Control: A Physics-Informed Framework for Provably Safe Reinforcement Learning

Miró López-Feliu, Oscar (2025) Lyapunov Dual-Policy Control: A Physics-Informed Framework for Provably Safe Reinforcement Learning. Bachelor's Thesis, Artificial Intelligence.

Abstract

Deep Reinforcement Learning (DRL) has achieved superhuman performance in diverse control tasks, yet its adoption in safety-critical physical systems remains limited by a lack of formal stability guarantees during training and inference. Classical controllers, while provably stable, are often conservative and model-bound. We introduce the Lyapunov Dual-Policy (LDP) controller, a framework that synthesizes three key concepts to bridge this gap: (i) A dual-policy structure, based on the work of Zoboli and Dibangoye, guarantees local asymptotic stability throughout training and deployment by blending a provably stable local Linear-Quadratic Regulator with a high-performance global DRL agent. (ii) The global policy is a physics-informed Lyapunov Actor-Critic (LAC), whose critic learns a maximal Lyapunov function by minimizing the residual of Zubov’s Partial Differential Equation, thereby maximizing the verifiable domain of attraction. (iii) A Counter-Example Guided Abstraction Refinement (CEGAR) loop uses formal verification to iteratively correct the learned stability certificate and stabilize the training process. Experiments on a nonlinear inverted-pendulum benchmark show that the LDP framework achieves a 100% convergence rate, a significantly higher reward, and faster convergence than competing classical-control and DRL baselines.
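
The dual-policy idea in point (i) can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes a unit-mass, unit-length pendulum with the angle measured from upright, a smoothstep blending weight with hypothetical radii, and a saturated feedback law standing in for the trained DRL actor. All function names, constants, and cost weights below are illustrative assumptions; the exact blending rule used by Zoboli and Dibangoye or by the thesis may differ.

```python
import math

# Assumed pendulum parameters (hypothetical, not taken from the thesis).
G, LENGTH, MASS = 9.81, 1.0, 1.0
A_LIN = G / LENGTH                    # linearised: theta'' = A_LIN*theta + B_LIN*u
B_LIN = 1.0 / (MASS * LENGTH ** 2)
U_MAX = 20.0                          # assumed actuator limit for the stand-in policy


def lqr_gain(a, b, q1, q2, r):
    """Closed-form continuous-time LQR gain for x' = [[0,1],[a,0]]x + [0,b]^T u
    with running cost q1*theta^2 + q2*omega^2 + r*u^2 (assumes b > 0, r > 0)."""
    # Entries of the stabilising solution P of the algebraic Riccati equation.
    p2 = (a + math.sqrt(a * a + q1 * b * b / r)) * r / (b * b)
    p3 = math.sqrt(r * (2.0 * p2 + q2)) / b
    return (b / r) * p2, (b / r) * p3  # K = R^-1 B^T P


K1, K2 = lqr_gain(A_LIN, B_LIN, q1=1.0, q2=1.0, r=1.0)


def local_policy(theta, omega):
    """Local LQR feedback, valid near the upright equilibrium."""
    return -(K1 * theta + K2 * omega)


def global_policy(theta, omega):
    """Placeholder for a trained DRL actor (hypothetical saturated feedback)."""
    return -U_MAX * math.tanh(2.0 * theta + omega)


def blend_weight(theta, omega, inner=0.2, outer=0.6):
    """1 inside radius `inner`, 0 outside `outer`, smoothstep in between."""
    d = math.hypot(theta, omega)
    if d <= inner:
        return 1.0
    if d >= outer:
        return 0.0
    s = (outer - d) / (outer - inner)
    return s * s * (3.0 - 2.0 * s)


def dual_policy(theta, omega):
    """Convex blend: pure LQR near the origin, pure global policy far away."""
    h = blend_weight(theta, omega)
    return h * local_policy(theta, omega) + (1.0 - h) * global_policy(theta, omega)


def simulate(theta, omega, steps=1000, dt=0.01):
    """Explicit-Euler rollout of the nonlinear pendulum under the blended policy."""
    for _ in range(steps):
        u = dual_policy(theta, omega)
        alpha = A_LIN * math.sin(theta) + B_LIN * u
        theta, omega = theta + dt * omega, omega + dt * alpha
    return theta, omega


if __name__ == "__main__":
    print("LQR gain:", K1, K2)
    print("final state from (0.1, 0.0):", simulate(0.1, 0.0))
```

Because the blending weight is exactly 1 in a neighbourhood of the origin, the closed loop there is governed by the LQR alone, which is what makes a local stability argument independent of how well the global policy has been trained.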

Item Type: Thesis (Bachelor's Thesis)
Supervisor name: Cardenas Cartagena, J. D.
Degree programme: Artificial Intelligence
Thesis type: Bachelor's Thesis
Language: English
Date Deposited: 22 Aug 2025 07:59
Last Modified: 22 Aug 2025 07:59
URI: https://fse.studenttheses.ub.rug.nl/id/eprint/36807
