This paper focuses on the application of deep meta-reinforcement learning to the robust design of low-thrust interplanetary trajectories in the presence of severe dynamic and control uncertainties. A closed-loop control policy is used to steer the spacecraft to a final target state despite the considered perturbations. The control policy is approximated by a deep recurrent neural network, trained by policy-gradient reinforcement learning on a collection of environments featuring mixed sources of uncertainty, namely dynamic uncertainty and control execution errors. The objective is for the network to build an internal representation of the environment distribution, so that the policy adapts to the different stochastic scenarios. The results obtained on a three-dimensional low-thrust transfer between Earth and Mars are compared, in terms of optimality, constraint handling, and robustness, with those returned by a traditional fully-connected network.
Robust Design of Interplanetary Trajectories under Severe Uncertainty via Meta-Reinforcement Learning / Federici, L.; Zavoli, A. - 2022-September:(2022). (Paper presented at the 73rd International Astronautical Congress, IAC 2022, held in Paris, France).
Robust Design of Interplanetary Trajectories under Severe Uncertainty via Meta-Reinforcement Learning
Federici, L.; Zavoli, A.
2022