Using Reinforcement Learning to Make Optimal Use of Available Power and Improving Overall Speed of a Solar-Powered Boat

Feldbrugge, R.L. (2010) Using Reinforcement Learning to Make Optimal Use of Available Power and Improving Overall Speed of a Solar-Powered Boat. Master's Thesis / Essay, Artificial Intelligence.

Abstract

In preparation for the Frisian Solar Challenge 2010 (the world cup for solar-powered boats), the Hanzehogeschool and RuG are designing a hydrofoiling solar boat. Hydrofoils are underwater wings that provide lift and can raise a boat out of the water. This reduces drag significantly, enabling higher speeds at lower power consumption. However, a lot of energy has to be spent to get the boat out of the water. Such large amounts of energy were previously not available through solar power, which makes hydrofoils state of the art in solar-powered boats. The hydrofoils of the solar boat are retractable, so during the race the boat can switch between two states (sailing hullborne or foilborne). Deciding when to switch is a difficult task that has to take many different variables into account. We show how reinforcement learning can be used to learn an optimal policy in a simulated environment, with the aim of finishing the race as fast as possible with a limited amount of energy. We use artificial neural networks as function approximators to estimate the value of each action in an arbitrary state. The learned policy will have an advisory role to the pilot of the boat during the race. We set up several experiments, iteratively increasing the complexity of the model. First, the partially observable Mountain Car problem was modeled. Our results show that with artificial neural networks the value can be predicted more accurately than with a traditional tabular approach, which resulted in significantly better performance than the standard method. Next, we modeled the solar boat in a simple race situation, competing on a linear track without an environment. We show how our algorithm is able to optimize the time required for the solar boat to reach the finish line. Finally, we modeled the solar boat within its environment, taking into account position on the track, cornering, solar energy, changing weather and shadows.
Two different policies were trained: the first on a single track (Specific Policy Algorithm, SPA), the other on all tracks (Generalizing Policy Algorithm, GPA). There was no significant difference between the performance of the SPA and the GPA. Both algorithms show a gradual decrease in the time required to reach the goal, optimizing energy use in a situation where energy is limited and unpredictable.
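
As an illustration of the approach described in the abstract, the following is a minimal sketch, not the thesis implementation, of a neural network used as a function approximator for action values. The abstract does not state the learning rule, the network size, or the state features, so the Q-learning target, the single tanh hidden layer, the epsilon-greedy selection, and all function names are assumptions made for illustration only.

```python
import math
import random

ACTIONS = ("hullborne", "foilborne")  # the two sailing states named in the abstract


def init_network(n_inputs, n_hidden, n_actions, seed=0):
    """Create small random weights for a one-hidden-layer network (assumed architecture)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_actions)]
    return {"w1": w1, "b1": [0.0] * n_hidden, "w2": w2, "b2": [0.0] * n_actions}


def forward(net, state):
    """Return (hidden activations, one estimated value per action)."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, state)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    values = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(net["w2"], net["b2"])]
    return hidden, values


def choose_action(net, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise pick the highest value."""
    if rng.random() < epsilon:
        return rng.randrange(len(net["b2"]))
    _, values = forward(net, state)
    return max(range(len(values)), key=values.__getitem__)


def td_update(net, state, action, reward, next_state, done, alpha=0.05, gamma=0.99):
    """One Q-learning step: move Q(state, action) toward the bootstrapped target."""
    hidden, values = forward(net, state)
    target = reward
    if not done:
        _, next_values = forward(net, next_state)
        target += gamma * max(next_values)
    error = target - values[action]
    # Backpropagate the error through the output unit of the chosen action only.
    for j, h in enumerate(hidden):
        grad_hidden = error * net["w2"][action][j] * (1.0 - h * h)
        net["w2"][action][j] += alpha * error * h
        for i, x in enumerate(state):
            net["w1"][j][i] += alpha * grad_hidden * x
        net["b1"][j] += alpha * grad_hidden
    net["b2"][action] += alpha * error
    return error
```

A hypothetical state vector such as `[speed, battery_level, sun_intensity]`, scaled to roughly the range 0 to 1, could be passed to `choose_action` to obtain advice on whether to sail hullborne or foilborne. A tabular approach would instead need a separate entry for every discretized state, which is the contrast the abstract draws.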

Item Type:	Thesis (Master's Thesis / Essay)
Degree programme:	Artificial Intelligence
Thesis type:	Master's Thesis / Essay
Language:	English
Date Deposited:	15 Feb 2018 07:44
Last Modified:	15 Feb 2018 07:44
URI:	https://fse.studenttheses.ub.rug.nl/id/eprint/9347