Latent alignment techniques enable modular policies in the context of reinforcement learning / Ricciardi, Antonio Pio. - (2025 May 22).

Latent alignment techniques enable modular policies in the context of reinforcement learning

RICCIARDI, ANTONIO PIO
22/05/2025

Abstract

Visual Reinforcement Learning is a popular and powerful framework that fully leverages recent breakthroughs in Deep Learning. However, variations in input domains (e.g., changes in background colors due to seasonal shifts) or task domains (e.g., modifying a car’s target speed) can degrade agent performance, often requiring retraining for each variation. Recent advances in representation learning have demonstrated the potential to combine components from different neural networks to construct new models in a zero-shot fashion. In this dissertation, we build upon these advances and adapt them to the Visual Reinforcement Learning setting, enabling the composition of agent components to form new agents capable of handling novel visual-task combinations not seen during training. This is achieved by establishing communication between encoders and controllers from different models trained under distinct variations. Our findings highlight the promise of model reuse, significantly reducing the need for retraining and thereby cutting down on both time and computational cost.
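The stitching idea described above can be illustrated with a toy sketch: given a small set of paired "anchor" observations embedded by two different encoders, an orthogonal Procrustes alignment maps one encoder's latent space into the other's, so a controller trained on the first encoder can drive observations embedded by the second, with no retraining. Everything here (the linear encoders, the linear policy, the dimensions) is a hypothetical stand-in for illustration, not the dissertation's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for two visual encoders trained under different
# variations (e.g., different background colors). Each maps the same
# observation to its own latent space; random linear maps for illustration.
W_a = rng.normal(size=(8, 16))   # encoder A: obs (16-dim) -> latent (8-dim)
W_b = rng.normal(size=(8, 16))   # encoder B: different latent space

def encode_a(obs): return obs @ W_a.T
def encode_b(obs): return obs @ W_b.T

# Assumed: a few anchor observations seen by both encoders; their paired
# latents let us estimate an alignment between the two latent spaces.
anchors = rng.normal(size=(64, 16))
Z_a, Z_b = encode_a(anchors), encode_b(anchors)

# Orthogonal Procrustes: rotation R minimizing ||Z_b @ R - Z_a||_F.
U, _, Vt = np.linalg.svd(Z_b.T @ Z_a)
R = U @ Vt

# Hypothetical controller trained on encoder A's latent space.
policy_a = rng.normal(size=(4, 8))   # latent (8-dim) -> action (4-dim)

def stitched_agent(obs):
    """Zero-shot stitching: encoder B's latents, aligned into A's space,
    drive controller A without any retraining."""
    return (encode_b(obs) @ R) @ policy_a.T

actions = stitched_agent(rng.normal(size=(5, 16)))   # shape (5, 4)
```

In practice the encoders are deep convolutional networks and the alignment is learned over high-dimensional representations, but the same principle applies: a cheap transformation between latent spaces replaces full retraining of the encoder-controller pair.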
Files attached to this record
File: Tesi_dottorato_Ricciardi.pdf
Access: open access
Note: complete thesis
Type: Doctoral thesis
License: Creative Commons
Size: 10.7 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1740489