An optimal control approach to Reinforcement Learning / Pesare, Andrea. - (2022 Jan 25).

An optimal control approach to Reinforcement Learning

PESARE, ANDREA
25/01/2022

Abstract

Optimal control and Reinforcement Learning (RL) both deal with sequential decision-making problems, although they use different tools. In this thesis, we investigate the connection between these two research areas; our contributions are twofold. In the first part of the thesis, we present and study an optimal control problem with uncertain dynamics. As a modeling assumption, we suppose that the agent's knowledge of the system is represented by a probability distribution π on the space of possible dynamics functions. The goal is to minimize an average cost functional, where the average is computed with respect to the probability distribution π. This framework captures the behavior of a class of model-based RL algorithms, which build a probabilistic model of the dynamics (here represented by π) and then compute the control by minimizing the expectation of the cost functional with respect to π. In this context, we establish convergence results for the value function and the optimal control; these results constitute an important step in the convergence analysis of this class of RL algorithms. In the second part, we propose a new online algorithm for Linear-Quadratic Regulator (LQR) problems in which the state matrix A is unknown. Our algorithm builds an approximation of the dynamics and finds a suitable control at the same time, during a single simulation. It integrates RL and optimal control techniques: a probabilistic model is updated at each iteration using Bayesian linear regression formulas, and the control is obtained in feedback form by solving a Riccati differential equation. Numerical tests show that the algorithm efficiently drives the system to the origin despite not having full knowledge of the dynamics at the beginning of the simulation.
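To make the setting of the first part concrete, one plausible way to formalize the averaged problem (the notation below is illustrative and not necessarily the one adopted in the thesis) is

\[
\dot{x}(t) = f\bigl(x(t), u(t)\bigr), \quad t \in [0, T], \qquad f \sim \pi,
\]
\[
J_\pi(u) = \mathbb{E}_{f \sim \pi}\!\left[\int_0^T \ell\bigl(x_f^u(t), u(t)\bigr)\, dt + g\bigl(x_f^u(T)\bigr)\right],
\qquad v_\pi = \inf_{u} J_\pi(u),
\]

where x_f^u denotes the trajectory produced by the control u under the dynamics f, ℓ and g are running and terminal costs, and v_π is the value function whose convergence properties are studied in the first part.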
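The online LQR scheme of the second part can be sketched in a few lines of code. The snippet below is a minimal illustration, under assumptions not stated in the abstract (the input matrix B is known, the dynamics are discretized with a simple Euler step, and all matrices, hyperparameters and noise levels are invented for the example), of the two ingredients mentioned above: a Bayesian linear regression update of the unknown state matrix A and a feedback control obtained from a Riccati differential equation. It is a sketch of the idea, not the thesis algorithm.

import numpy as np

def posterior_mean_A(X, Y, noise_var=1e-2, prior_var=10.0):
    # Bayesian linear regression with an independent Gaussian prior on each row
    # of A: the posterior mean solves a regularized least-squares problem.
    # X: (N, n) visited states; Y: (N, n) observed drifts x_dot - B u.
    n = X.shape[1]
    lam = noise_var / prior_var
    S = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ Y)   # shape (n, n)
    return S.T

def riccati_P(A, B, Q, R, QT, horizon, dt):
    # Integrate the matrix Riccati differential equation backward in time
    # (explicit Euler) and return P at the start of the remaining horizon.
    P = QT.copy()
    Rinv = np.linalg.inv(R)
    for _ in range(max(int(horizon / dt), 1)):
        P = P + dt * (A.T @ P + P @ A - P @ B @ Rinv @ B.T @ P + Q)
    return P

# Illustrative closed-loop run: the controller never sees A_true directly.
rng = np.random.default_rng(0)
n, m, T, dt = 2, 1, 5.0, 0.01
A_true = np.array([[0.0, 1.0], [1.5, -0.5]])   # unknown state matrix
B = np.array([[0.0], [1.0]])                   # input matrix, assumed known
Q, R, QT = np.eye(n), 0.1 * np.eye(m), np.eye(n)

x = np.array([1.0, 0.0])
A_hat = np.zeros((n, n))                       # prior mean for A
states, drifts = [], []

for k in range(int(T / dt)):
    t = k * dt
    P = riccati_P(A_hat, B, Q, R, QT, T - t, dt)
    u = -np.linalg.solve(R, B.T @ P @ x)       # feedback u = -R^{-1} B^T P x
    x_dot = A_true @ x + B @ u + 0.05 * rng.standard_normal(n)
    states.append(x.copy())
    drifts.append(x_dot - B @ u)               # noisy measurement of A x
    x = x + dt * x_dot
    A_hat = posterior_mean_A(np.array(states), np.array(drifts))

print("final state:", x)
print("estimated A:\n", A_hat)

Each observed drift x_dot - B u is a noisy measurement of A x, so the posterior mean of A is a regularized least-squares estimate, and the feedback gain is recomputed from the Riccati equation on the remaining horizon as the estimate improves.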
Files attached to this item

Tesi_dottorato_Pesare.pdf (open access)
Type: Doctoral thesis
License: Creative Commons
Size: 2.49 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1634795