Convergence results for an averaged LQR problem with applications to reinforcement learning

Pesare A.; Palladino M.; Falcone M.
2021

Abstract

In this paper, we deal with a linear quadratic optimal control problem with unknown dynamics. As a modeling assumption, we suppose that the agent's knowledge of the system is represented by a probability distribution π on the space of matrices. Furthermore, we assume that this probability measure is suitably updated to take into account the increased experience that the agent obtains while exploring the environment, so that it approximates the underlying dynamics with increasing accuracy. Under these assumptions, we show that the optimal control obtained by solving the “averaged” linear quadratic optimal control problem with respect to a given π converges to the optimal control of the linear quadratic problem governed by the actual, underlying dynamics. This approach is closely related to model-based reinforcement learning algorithms, where prior and posterior probability distributions describing the knowledge of the uncertain system are recursively updated. In the last section, we present a numerical test that confirms the theoretical results.
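
To make the abstract's statement concrete for readers skimming this record, here is one plausible way to write the averaged problem; the notation (finite horizon T, weight matrices Q, R, Q_f, uncertainty confined to the state matrix A) is assumed for illustration and may differ from the paper's actual setting. The key point is that a single control u is shared across all realizations of the unknown matrix A drawn from π:

% Hypothetical formulation, not taken verbatim from the paper.
\[
  \min_{u(\cdot)} \; \mathbb{E}_{A \sim \pi}
  \left[ \int_0^T \big( x_A(t)^\top Q \, x_A(t) + u(t)^\top R \, u(t) \big)\, dt
       + x_A(T)^\top Q_f \, x_A(T) \right]
\]
\[
  \text{subject to} \quad \dot{x}_A(t) = A \, x_A(t) + B \, u(t), \qquad x_A(0) = x_0 .
\]

In this reading, the convergence result says that if the beliefs π_n concentrate around the true matrix A*, then the minimizers of the averaged problems converge to the optimal control of the classical LQR problem with dynamics A*.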
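The numerical test mentioned in the last sentence of the abstract is not reproduced here, but the following minimal Python sketch illustrates the same convergence phenomenon on a one-dimensional discrete-time system x_{k+1} = a x_k + b u_k with unknown scalar a: the belief π is a set of weighted samples that concentrates around the true a*, and the averaged-optimal open-loop control is compared with the one computed from the true dynamics. All names and parameter values are illustrative, not taken from the paper.

import numpy as np

def averaged_lqr_control(a_samples, weights, b, x0, N, q=1.0, r=0.1):
    """Minimize E_pi[ sum_k q*x_k^2 + r*u_k^2 ] over open-loop u in R^N.
    Since each trajectory depends linearly on u, the expected cost is quadratic in u."""
    H = r * np.eye(N)
    g = np.zeros(N)
    for a, w in zip(a_samples, weights):
        # Stack the trajectory as x = c + M u, with x in R^{N+1}:
        # x_k = a^k x0 + sum_{j<k} a^{k-1-j} b u_j.
        c = np.array([a**k * x0 for k in range(N + 1)])
        M = np.zeros((N + 1, N))
        for k in range(1, N + 1):
            for j in range(k):
                M[k, j] = a**(k - 1 - j) * b
        H += w * q * (M.T @ M)   # quadratic term of the expected cost
        g += w * q * (M.T @ c)   # linear term of the expected cost
    return np.linalg.solve(H, -g)  # unique minimizer of u^T H u + 2 g^T u

a_star, b, x0, N = 0.9, 1.0, 1.0, 20
u_true = averaged_lqr_control([a_star], [1.0], b, x0, N)  # control under the true dynamics

rng = np.random.default_rng(0)
for sigma in [0.5, 0.1, 0.02]:  # belief pi concentrating around a_star
    a_samples = a_star + sigma * rng.standard_normal(200)
    w = np.full(200, 1.0 / 200)
    u_avg = averaged_lqr_control(a_samples, w, b, x0, N)
    print(f"sigma={sigma:5.2f}  ||u_avg - u_true|| = {np.linalg.norm(u_avg - u_true):.4f}")

As sigma shrinks, the gap ||u_avg - u_true|| goes to zero, mirroring in this toy setting the convergence statement of the abstract.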
Keywords: Averaged control; convergence; linear quadratic regulator; model-based RL; optimal control; reinforcement learning
Publication type: 01 Journal publication::01a Journal article
Convergence results for an averaged LQR problem with applications to reinforcement learning / Pesare, A.; Palladino, M.; Falcone, M. - In: Mathematics of Control, Signals, and Systems. - ISSN 0932-4194. - 33:3 (2021), pp. 379-411. [DOI: 10.1007/s00498-021-00294-y]
Files attached to this record

File: Pesare_Convergence_2021.pdf
Access: open access
Note: https://link.springer.com/content/pdf/10.1007/s00498-021-00294-y.pdf
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.1 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1604568
Citations
  • Scopus: 3
  • Web of Science: 3