Lukas Frieß: Model-based Reinforcement Learning with First-Principle Models (FAU Erlangen-Nürnberg, 2021)

Motivation

Reinforcement learning (RL) is increasingly used in robotics to learn complex tasks from repeated interactions with the environment. For example, a mobile robot can learn to avoid an obstacle by testing a large number of randomized motion strategies. Success is measured by a reward function that is to be maximized over the course of the iterations. Reinforcement learning algorithms are commonly divided into model-free and model-based approaches: the former learn a policy directly, while the latter use the interactions to build a model of the environment. In contrast, a model predictive controller solves a dynamic optimization problem at each time step to determine the optimal control strategy. Although reinforcement learning and model predictive control can be used to solve the same tasks, the two approaches are rarely compared directly.
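
To make the contrast concrete, a generic receding-horizon formulation of the model predictive control problem can be sketched as follows; the stage cost \ell, terminal cost V_f, dynamics f, and horizon N are generic placeholders, not quantities prescribed by this topic:

    \min_{u_0,\dots,u_{N-1}} \; V_f(x_N) + \sum_{k=0}^{N-1} \ell(x_k, u_k)
    \quad \text{s.t.} \quad x_{k+1} = f(x_k, u_k), \quad k = 0,\dots,N-1, \quad x_0 = \hat{x}(t)

Only the first control u_0 of the optimal sequence is applied; at the next sampling instant the problem is solved again from the newly measured state \hat{x}(t).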

Task definition

In a collaboration between the Chair of Automatic Control and the Machine Learning and Information Fusion Group of Fraunhofer IIS, research is being conducted on possible combinations of reinforcement learning and real-time nonlinear model predictive control. The goal of this master thesis is to implement a model-based reinforcement learning algorithm using methodologies common in model predictive control for dynamical systems, such as adjoint-based sensitivity analysis. To compare the approach with existing algorithms, one of the examples in the freely available benchmark suite by Tingwu Wang et al. [1] can be used.
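
As a rough illustration of the methodology referred to above, the following minimal Python sketch applies adjoint-based sensitivity analysis to a discrete-time optimal control problem; the pendulum-like dynamics, reward, and horizon are hypothetical placeholders rather than the actual setup of the thesis:

    import numpy as np

    # Hypothetical pendulum-like discrete-time dynamics and stage reward;
    # these are illustrative placeholders, not the system of the thesis.
    def f(x, u):                            # x_{k+1} = f(x_k, u_k)
        return np.array([x[0] + 0.1 * x[1],
                         x[1] + 0.1 * (u[0] - np.sin(x[0]))])

    def f_x(x, u):                          # Jacobian of f w.r.t. x
        return np.array([[1.0, 0.1],
                         [-0.1 * np.cos(x[0]), 1.0]])

    def f_u(x, u):                          # Jacobian of f w.r.t. u
        return np.array([[0.0], [0.1]])

    def r(x, u):                            # stage reward (to be maximized)
        return -x @ x - 0.01 * u @ u

    def total_reward(x0, U):                # J(U) = sum_k r(x_k, u_k)
        x, J = x0, 0.0
        for u in U:
            J, x = J + r(x, u), f(x, u)
        return J

    def adjoint_gradient(x0, U):
        # Gradient of J w.r.t. the whole control sequence U via one forward
        # rollout and one backward sweep of the adjoint (costate) recursion
        #   lam_k = dr/dx_k + f_x(x_k, u_k)^T lam_{k+1},   lam_N = 0.
        N, X = len(U), [x0]
        for k in range(N):                  # forward pass: simulate the model
            X.append(f(X[k], U[k]))
        lam, grad = np.zeros_like(x0), np.zeros_like(U)
        for k in reversed(range(N)):        # backward pass
            grad[k] = -0.02 * U[k] + f_u(X[k], U[k]).T @ lam  # dr/du + f_u^T lam
            lam = -2.0 * X[k] + f_x(X[k], U[k]).T @ lam       # dr/dx + f_x^T lam
        return grad

    # Sanity check: adjoint gradient vs. a finite-difference approximation.
    x0, U = np.array([0.5, 0.0]), np.zeros((20, 1))
    dU = np.zeros_like(U); dU[0, 0] = 1e-6
    print(adjoint_gradient(x0, U)[0, 0],
          (total_reward(x0, U + dU) - total_reward(x0, U)) / 1e-6)

The appeal of the adjoint approach is that one backward sweep yields the gradient with respect to all controls at O(N) cost, independent of the number of decision variables, which matters for gradient-based optimization over long horizons.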

Requirements: Basic knowledge of control theory and model predictive control; programming experience in MATLAB, C, and Python is an advantage.

Supervisors:

See also the description at the Chair of Automatic Control @ FAU

References

  1. Tingwu Wang et al., "Benchmarking Model-Based Reinforcement Learning," arXiv:1907.02057, 2019.