Deep learning techniques for autonomous spacecraft guidance during proximity operations / Federici, L.; Benedikter, B.; Zavoli, A. - In: JOURNAL OF SPACECRAFT AND ROCKETS. - ISSN 0022-4650. - 58:6 (2021), pp. 1774-1785. [DOI: 10.2514/1.A35076]
Deep learning techniques for autonomous spacecraft guidance during proximity operations
Federici L. (first author); Benedikter B. (second author); Zavoli A. (last author)
2021
Abstract
This paper investigates the use of deep learning techniques for real-time optimal spacecraft guidance during terminal rendezvous maneuvers, in the presence of both operational constraints and stochastic effects, such as inaccurate knowledge of the initial spacecraft state and random in-flight disturbances. The performance of two well-studied deep learning methods, behavioral cloning (BC) and reinforcement learning (RL), is investigated on a linear multi-impulsive rendezvous mission. To this aim, a multilayer perceptron network with a custom architecture is designed to map any observation of the actual spacecraft relative position and velocity to the propellant-optimal control action, which corresponds to a bounded-magnitude impulsive velocity variation. In the BC approach, the deep neural network is trained by supervised learning on a set of optimal trajectories, generated by repeatedly solving the deterministic optimal control problem via convex optimization, starting from scattered initial conditions. Conversely, in the RL approach, a state-of-the-art actor–critic algorithm, proximal policy optimization, is used to train the network through repeated interactions with the stochastic environment. Eventually, the robustness and propellant efficiency of the obtained closed-loop control policies are assessed and compared by means of a Monte Carlo analysis, carried out on test cases with increasing levels of perturbation.
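As an illustration of the guidance policy the abstract describes, the sketch below mocks up a multilayer perceptron that maps a relative position/velocity observation to a bounded-magnitude impulsive velocity variation. This is a minimal numpy sketch, not the paper's actual network: the layer sizes, the tanh squashing used to bound the impulse magnitude, and the `dv_max` value are all assumptions made here for illustration.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random initialization for a small MLP; layer sizes are illustrative."""
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def policy(params, state, dv_max=0.1):
    """Map a 6-dim relative state (position, velocity) to a 3-dim impulse.

    The output magnitude is squashed through tanh so that ||dv|| <= dv_max;
    this is one common way to enforce a bounded-magnitude action, and the
    paper's exact bounding scheme may differ.
    """
    x = state
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)          # hidden layers
    W, b = params[-1]
    raw = x @ W + b                     # unbounded 3-dim output
    norm = np.linalg.norm(raw)
    scale = dv_max * np.tanh(norm) / max(norm, 1e-12)
    return raw * scale                  # ||dv|| <= dv_max by construction

# Example: evaluate the (untrained) policy on one observation.
rng = np.random.default_rng(0)
params = init_mlp([6, 64, 64, 3], rng)  # 6-dim observation -> 3-dim impulse
dv = policy(params, np.array([1.0, -0.5, 0.2, 0.01, 0.0, -0.02]))
```

In BC, such parameters would be fit by regression against convex-optimization solutions; in RL, they would be updated by proximal policy optimization from interaction with the stochastic environment.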
File: federici_deep-learning-techniques_2021.pdf (access restricted to archive managers)
Type: Publisher's version (published with the publisher's layout)
License: All rights reserved
Size: 3.17 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.