Robust Waypoint Guidance of a Hexacopter on Mars using Meta-Reinforcement Learning / Federici, Lorenzo; Furfaro, Roberto; Zavoli, Alessandro; De Matteis, Guido. - (2023). (Paper presented at the AIAA SciTech Forum and Exposition, 2023, held in National Harbor, MD, USA) [10.2514/6.2023-2663].

Robust Waypoint Guidance of a Hexacopter on Mars using Meta-Reinforcement Learning

Federici, Lorenzo; Zavoli, Alessandro; De Matteis, Guido
2023

Abstract

This paper presents a meta-reinforcement learning approach to the robust and autonomous waypoint guidance of a six-rotor unmanned aerial vehicle in Mars' atmosphere. Meta-learning is implemented by using a recurrent neural network as the control policy, which maps the hexacopter state data provided by onboard sensors to the six rotor angular speeds. The network is trained with proximal policy optimization, a state-of-the-art policy gradient reinforcement learning algorithm. During training, the network is also fed the previous control output and reward, to improve the policy's adaptability to different environment instances. Several mission scenarios, involving uncertainties in the properties of Mars' atmosphere, random wind gusts, and Gaussian noise on the collected sensor data, are investigated to assess the robustness of the proposed approach in realistic operating conditions. The flexibility and performance of meta-reinforcement learning are also compared against standard reinforcement learning with a fully-connected neural network, to highlight the potential of the proposed methodology in real-world autonomous guidance applications.
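The meta-RL setup described in the abstract — a recurrent policy that receives the current state together with the previous action and reward, so its hidden state can adapt online to the environment instance — can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the state dimension (12), hidden size (32), and rotor-speed cap (500 rad/s) are hypothetical placeholders, and a plain tanh RNN cell stands in for whatever recurrent architecture the authors used.

```python
import numpy as np

def init_params(state_dim=12, action_dim=6, hidden_dim=32, seed=0):
    """Random policy weights. Dimensions are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    # Input = hexacopter state + previous action + previous scalar reward.
    in_dim = state_dim + action_dim + 1
    return {
        "W_x": rng.normal(0.0, 0.1, (hidden_dim, in_dim)),
        "W_h": rng.normal(0.0, 0.1, (hidden_dim, hidden_dim)),
        "b_h": np.zeros(hidden_dim),
        "W_o": rng.normal(0.0, 0.1, (action_dim, hidden_dim)),
        "b_o": np.zeros(action_dim),
    }

def policy_step(params, state, prev_action, prev_reward, h):
    """One step of the recurrent policy.

    The hidden state h carries information across time steps, which is
    what lets a meta-trained network infer and adapt to the current
    environment instance (atmosphere, wind, sensor noise) at run time.
    """
    x = np.concatenate([state, prev_action, [prev_reward]])
    h = np.tanh(params["W_x"] @ x + params["W_h"] @ h + params["b_h"])
    # Squash outputs to positive rotor angular speeds; 500 rad/s is an
    # arbitrary cap chosen for this sketch.
    logits = params["W_o"] @ h + params["b_o"]
    action = 500.0 / (1.0 + np.exp(-logits))
    return action, h
```

In training, PPO would update these weights from rollouts collected across many randomized environment instances; at deployment the weights stay fixed and only the hidden state adapts.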
2023
AIAA SciTech Forum and Exposition, 2023
meta-reinforcement learning; hexacopter; robust guidance and control
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this product
No files are associated with this product.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1714832
Warning! The displayed data have not been validated by the university.

Citations
  • PMC: ND
  • Scopus: ND
  • Web of Science: ND