
Meta-reinforcement learning for adaptive spacecraft guidance during multi-target missions / Federici, L.; Scorsoglio, A.; Zavoli, A.; Furfaro, R. - C1:(2021). (Paper presented at the IAF Astrodynamics Symposium 2021 at the 72nd International Astronautical Congress, IAC 2021, held in Dubai, UAE).

Meta-reinforcement learning for adaptive spacecraft guidance during multi-target missions

Federici L. (first author); Zavoli A.
2021

Abstract

In this paper, a meta-reinforcement learning approach is used to generate a guidance algorithm capable of carrying out multi-target missions. Specifically, two models are trained to learn how to realize multiple fuel-optimal low-thrust rendezvous maneuvers between circular co-planar orbits with close radii. The first model is entirely based on a Multilayer Perceptron (MLP) neural network, while the second one also relies on a Long Short-Term Memory (LSTM) layer, which provides augmented generalization capability by incorporating memory-dependent internal states. The two networks are trained via Proximal Policy Optimization (PPO) on a wide distribution of transfers, which encompasses all possible trajectories connecting any pair of targets in a given set within a given time window. The aim is to produce a nearly-optimal guidance law that can be directly used for any transfer leg of the actual multi-target mission. To assess the validity of the proposed approach, a sensitivity analysis on a single leg is carried out by varying either the initial or the final orbit radius, the transfer time, and the initial phase angle between the chaser and the target. The results show that the LSTM-equipped network better reconstructs the optimal control in almost all the analyzed scenarios while, at the same time, achieving, on average, a lower terminal constraint violation.
2021
IAF Astrodynamics symposium 2021 at the 72nd International astronautical congress, IAC 2021
multi-target mission; autonomous spacecraft guidance; meta-reinforcement learning; recurrent neural network; proximal policy optimization; optimal control
04 Conference proceedings publication::04b Conference paper in volume
Files attached to this product
Federici_Meta-Reinforcement_2021.pdf
Access: repository managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.58 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1640430
Citations
  • Scopus: 4