Reinforcement Learning (RL) struggles to transfer knowledge efficiently across tasks. In this work, we focus on transferring policies between different temporally extended tasks expressed in Linear Temporal Logic (LTL). Existing methods either rely on task-specific representations or train deep-neural-network-based task-state representations through interaction with the environment. We propose a novel approach that leverages the semantic similarity of formulas to compute transferable task-state representations directly from task specifications, offline and without any learning process. Preliminary experiments on temporally extended navigation in a grid-world domain show that our semantic representation outperforms baseline methods. This approach lays the groundwork for lightweight, transferable task-state representations based solely on task semantics, offering a promising avenue for efficient RL in temporally extended tasks without extensive retraining.
Transfer Learning between non-Markovian RL Tasks through Semantic Representations of Temporal States / Fanti, Andrea; Umili, Elena; Capobianco, Roberto. - (2024). (Paper presented at the 1st International Workshop on Adjustable Autonomy and Physical Embodied Intelligence (AAPEI), held in Santiago de Compostela, Spain).
Transfer Learning between non-Markovian RL Tasks through Semantic Representations of Temporal States
Andrea Fanti; Elena Umili; Roberto Capobianco
2024