Meta-reinforcement learning for adaptive spacecraft guidance during finite-thrust rendezvous missions / Federici, Lorenzo; Scorsoglio, Andrea; Zavoli, Alessandro; Furfaro, Roberto. - In: ACTA ASTRONAUTICA. - ISSN 0094-5765. - 201:(2022), pp. 129-141. [10.1016/j.actaastro.2022.08.047]

Meta-reinforcement learning for adaptive spacecraft guidance during finite-thrust rendezvous missions

Federici, Lorenzo; Zavoli, Alessandro
2022

Abstract

In this paper, a meta-reinforcement learning approach is investigated to design an adaptive guidance algorithm capable of carrying out multiple rendezvous space missions. Specifically, both a standard fully-connected network and a recurrent neural network are trained by proximal policy optimization on a wide distribution of finite-thrust rendezvous transfers between circular co-planar orbits. The recurrent network is also provided with the control and reward at the previous simulation step, thus allowing it to build, thanks to its history-dependent state, an internal representation of the considered task distribution. The ultimate goal is to generate a model which could adapt to unseen tasks and produce a nearly-optimal guidance law along any transfer leg of a multi-target mission. As a first step towards the solution of a complete multi-target problem, a sensitivity analysis on the single rendezvous leg is carried out in this paper, by varying the radius either of the initial or the final orbit, the transfer time, and the initial phasing between the chaser and the target. Numerical results show that the recurrent-network-based meta-reinforcement learning approach is able to better reconstruct the optimal control in almost all the analyzed scenarios, and, at the same time, to meet, with greater accuracy, the terminal rendezvous condition, even when considering problem instances that fall outside the original training domain.
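The key mechanism described in the abstract is that the recurrent policy receives, in addition to the current observation, the control and reward from the previous simulation step, so its hidden state can identify the task at hand. A minimal sketch of assembling such an augmented input is given below; the 6-dimensional state (in-plane position and velocity) and 3-dimensional thrust control are assumed dimensions for illustration, not taken from the paper.

```python
import numpy as np

def meta_rl_input(obs, prev_action, prev_reward):
    """Concatenate the current observation with the previous control
    and reward, forming the input of a recurrent meta-RL policy whose
    hidden state builds an internal representation of the task."""
    return np.concatenate([obs, prev_action, [prev_reward]])

# Hypothetical sizes: 6-dim chaser state, 3-dim finite-thrust control.
obs = np.zeros(6)
prev_action = np.zeros(3)
x = meta_rl_input(obs, prev_action, prev_reward=0.0)
print(x.shape)  # (10,)
```

At the first step of an episode, the previous action and reward are typically zero-filled, as in the example above.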
rendezvous mission; autonomous spacecraft guidance; meta-reinforcement learning; recurrent neural network; proximal policy optimization; optimal control
01 Journal publication::01a Journal article
Files attached to this item
Federici_postprint_Meta_2024.pdf

open access

Note: Publisher version: https://doi.org/10.1016/j.actaastro.2022.08.047
Type: Pre-print (manuscript submitted to the publisher, prior to peer review)
License: All rights reserved
Size: 2.71 MB
Format: Adobe PDF
Federici_Meta_2022.pdf

restricted to archive administrators

Note: https://www.sciencedirect.com/science/article/pii/S009457652200460X?via=ihub
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 3.12 MB
Format: Adobe PDF
Contact the author

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1713991
Citations
  • Scopus: 28
  • Web of Science (ISI): 21