Miró López-Feliu, Oscar (2025) Lyapunov Dual-Policy Control: A Physics-Informed Framework for Provably Safe Reinforcement Learning. Bachelor's Thesis, Artificial Intelligence.
Abstract
Deep Reinforcement Learning (DRL) has achieved superhuman performance in diverse control tasks, yet its adoption in safety-critical physical systems remains limited by a lack of formal stability guarantees during training and inference. Classical controllers, while provably stable, are often conservative and model-bound. We introduce the Lyapunov Dual-Policy (LDP) controller, a framework that synthesizes three key concepts to bridge this gap: (i) A dual-policy structure, based on the work of Zoboli and Dibangoye, guarantees local asymptotic stability throughout training and deployment by blending a provably stable local Linear-Quadratic Regulator with a high-performance global DRL agent. (ii) The global policy is a physics-informed Lyapunov Actor-Critic (LAC), whose critic learns a maximal Lyapunov function by minimizing the residual of Zubov’s Partial Differential Equation, thereby maximizing the verifiable domain of attraction. (iii) A Counter-Example Guided Abstraction Refinement (CEGAR) loop uses formal verification to iteratively correct the learned stability certificate and stabilize the training process. Experiments on a nonlinear inverted-pendulum benchmark show that the LDP framework achieves a 100% convergence rate, with significantly higher reward and faster convergence than competing classical control and DRL baselines.
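To make the Zubov residual in point (ii) concrete, here is a minimal sketch on a toy problem. Nothing in it is taken from the thesis: the 1-D system, the weight `phi`, and the function names are all assumptions chosen for illustration. It uses one common form of Zubov's equation, W'(x) f(x) = -phi(x) (1 - W(x)), and the system x' = -x + x^3, whose origin has the domain of attraction (-1, 1). With phi(x) = 2x^2, the function W(x) = x^2 solves the equation exactly and reaches 1 on the boundary of that domain, so its residual should vanish while a wrong candidate's should not.

```python
# Toy illustration only: the system, phi, and all names are assumptions,
# not the thesis's benchmark or implementation.

def f(x):
    """Dynamics of the toy system x' = -x + x**3."""
    return -x + x**3

def phi(x):
    """Positive-definite weight in Zubov's equation (chosen for this example)."""
    return 2.0 * x**2

def zubov_residual(W, x, eps=1e-5):
    """Residual W'(x) f(x) + phi(x) (1 - W(x)), with W' from central differences."""
    dW = (W(x + eps) - W(x - eps)) / (2.0 * eps)
    return dW * f(x) + phi(x) * (1.0 - W(x))

def mean_squared_residual(W, points):
    """Mean squared residual over sampled states, usable as a training loss."""
    return sum(zubov_residual(W, x) ** 2 for x in points) / len(points)

def exact(x):
    """Solves Zubov's equation exactly for this f and phi."""
    return x**2

def wrong(x):
    """A candidate that does not satisfy the equation."""
    return 0.5 * x**2

# Sample states inside the domain of attraction (-1, 1).
points = [i / 10.0 for i in range(-9, 10)]
loss_exact = mean_squared_residual(exact, points)
loss_wrong = mean_squared_residual(wrong, points)
print(loss_exact, loss_wrong)
```

In a learned setting the candidate `W` would presumably be a neural critic and the derivative would come from automatic differentiation rather than finite differences; the sketch only shows why driving this residual to zero singles out a function whose unit level set marks the domain of attraction.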
| Item Type: | Thesis (Bachelor's Thesis) |
|---|---|
| Supervisor name: | Cardenas Cartagena, J. D. |
| Degree programme: | Artificial Intelligence |
| Thesis type: | Bachelor's Thesis |
| Language: | English |
| Date Deposited: | 22 Aug 2025 07:59 |
| Last Modified: | 22 Aug 2025 07:59 |
| URI: | https://fse.studenttheses.ub.rug.nl/id/eprint/36807 |
