Licks, G. P.; Patrizi, F.; De Giacomo, G. Monte Carlo Tree Search with State Merging for Reinforcement Learning in Regular Decision Processes. In: Proceedings of the 27th European Conference on Artificial Intelligence (ECAI 2024), Santiago de Compostela, Spain, vol. 392, pp. 1373-1380 (2024). DOI: 10.3233/FAIA240637.
Monte Carlo Tree Search with State Merging for Reinforcement Learning in Regular Decision Processes
Licks G. P.; Patrizi F.; De Giacomo G.
2024
Abstract
This paper introduces a novel algorithm for Reinforcement Learning (RL) in Regular Decision Processes (RDPs), a model of non-Markovian decision processes whose dynamics and rewards depend on regular properties of the history. Our algorithm is inspired by Monte Carlo tree search (MCTS) but extends it with state-merging capabilities. Merging allows the tree model to evolve into a graph over time: we periodically apply similarity tests borrowed from automata-learning theory to identify states that are equivalent to one another. This yields improved efficiency and scalability over standard MCTS. We present empirical results demonstrating orders-of-magnitude performance improvements over state-of-the-art RL algorithms for RDPs.
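To make the state-merging idea described in the abstract concrete, the following is a minimal, hypothetical sketch (not the authors' actual algorithm): search-model nodes carry empirical statistics, a Hoeffding-style similarity test of the kind used in automata-learning algorithms decides whether two nodes look statistically equivalent, and merging pools their statistics so that the tree becomes a graph. All names (`Node`, `similar`, `merge`) and the choice of testing mean rewards are illustrative assumptions.

```python
import math

class Node:
    """A node in the search model: visit count, accumulated reward,
    and children indexed by an action/observation label."""
    def __init__(self):
        self.visits = 0
        self.reward_sum = 0.0
        self.children = {}  # label -> Node

    def mean_reward(self):
        return self.reward_sum / self.visits if self.visits else 0.0

def similar(a, b, delta=0.05):
    """Hoeffding-style similarity test (in the spirit of automata-learning
    state-merging algorithms): accept a merge when the empirical mean
    rewards differ by less than the combined confidence radii."""
    if a.visits == 0 or b.visits == 0:
        return False
    radius = (math.sqrt(math.log(2 / delta) / (2 * a.visits))
              + math.sqrt(math.log(2 / delta) / (2 * b.visits)))
    return abs(a.mean_reward() - b.mean_reward()) <= radius

def merge(a, b):
    """Merge node b into node a, pooling statistics and recursively
    merging children with the same label. After a merge, several parents
    may point to the same node, turning the tree into a graph."""
    a.visits += b.visits
    a.reward_sum += b.reward_sum
    for label, child in b.children.items():
        if label in a.children:
            merge(a.children[label], child)
        else:
            a.children[label] = child
    return a
```

A periodic pass over the model would run `similar` on candidate node pairs and call `merge` on those that pass, which is how a tree-shaped model can collapse equivalent histories into shared states over time.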


