
Learning a Symbolic Planning Domain through the Interaction with Continuous Environments / Umili, Elena; Antonioni, Emanuele; Riccio, Francesco; Capobianco, Roberto; Nardi, Daniele; De Giacomo, Giuseppe. - (2021). (Paper presented at the International Conference on Automated Planning and Scheduling, ICAPS 2021, held in Guangzhou, China).

Learning a Symbolic Planning Domain through the Interaction with Continuous Environments

Elena Umili; Emanuele Antonioni; Francesco Riccio; Roberto Capobianco; Daniele Nardi; Giuseppe De Giacomo
2021

Abstract

One of the main challenges in AI is performing dynamic tasks with approaches that efficiently predict the environment's future outcomes. State-of-the-art planners can reason effectively over symbolic representations of the environment. However, when the environment is continuous and unstructured, manually extracting an ad hoc symbolic model to perform planning may be infeasible. Deep Reinforcement Learning can automatically learn compact representations of the state space through interaction with the environment, but it is not suited to planning, giving up the efficiency we would gain by predicting the consequences of actions. This work focuses on continuous state-space MDPs and proposes an approach that naturally combines interaction, symbolic representation learning, and symbolic online planning. Our system leverages experience data gathered from the environment to autonomously learn a symbolic planning model composed of: (1) a symbol grounding model to switch from the continuous to the symbolic space and vice versa; (2) a symbolic transition model; (3) a value function over symbolic states. This model is used at training time to guide the interaction with the world. At each interaction step, we perform fast symbolic online planning over a finite horizon to choose the action to execute in the environment. The success of this strategy implicitly validates our automatically extracted symbolic model, since the system is able to effectively plan actions in the original MDP by reasoning only in the finite, symbolic domain. The approach has been evaluated on several continuous OpenAI Gym environments, successfully addressing both control problems and games.
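To make the planning step concrete, the following is a minimal sketch of finite-horizon online planning over a learned symbolic model, in the spirit of the abstract. All names here (`transition`, `value`, the toy domain) are illustrative assumptions, not the authors' actual interfaces: `transition(s, a)` stands in for the learned symbolic transition model and `value(s)` for the learned value function over symbolic states.

```python
def plan(state, transition, value, actions, horizon, gamma=0.95):
    """Depth-limited search over a learned symbolic transition model.

    Returns (best_action, best_return). At the horizon, the learned
    value function estimates the return of the reached symbolic state.
    """
    if horizon == 0:
        return None, value(state)
    best_action, best_return = None, float("-inf")
    for a in actions:
        next_state, reward = transition(state, a)
        _, future = plan(next_state, transition, value, actions,
                         horizon - 1, gamma)
        total = reward + gamma * future
        if total > best_return:
            best_action, best_return = a, total
    return best_action, best_return


# Toy symbolic domain (hypothetical): integer states, goal state 3,
# reward 1.0 for entering the goal, value = negative distance to goal.
def toy_transition(s, a):
    s2 = s + (1 if a == "right" else -1)
    return s2, (1.0 if s2 == 3 else 0.0)


def toy_value(s):
    return -abs(3 - s)


action, ret = plan(0, toy_transition, toy_value,
                   ["left", "right"], horizon=3)
```

In the paper's setting, the first action of the best plan would be executed in the environment and planning repeated at the next step; here, exhaustive search is feasible precisely because the symbolic state and action spaces are finite.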
Files attached to this item
There are no files associated with this item.

Items in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this item: https://hdl.handle.net/11573/1581213
Warning! The displayed data have not been validated by the university.
