Several recent approaches in reinforcement learning study a conceptual architecture in which the environment is simultaneously represented at two (or more) levels of abstraction: the environment provides two traces of data, events, features, or fluents, one at a lower level (finer grain) and one at a higher level (coarser grain). For simplicity, most of this literature assumes that the instants of the two traces match. In this paper, we drop this strong assumption and introduce an explicit mapping between the low-level and the high-level traces, which the high-level trace perceives as a clock defined in terms of properties of segments of the low-level trace. We investigate the case of regular mappings, where the segments that induce clock ticks are specified by a regular language property or a finite-state machine. We show that if both the clock and the high-level specification are expressed as finite-state machines, such as reward machines, the two specifications can be combined in polynomial time into a single machine incorporating the clock. We then investigate the case in which both the clock and the high-level task are specified declaratively, e.g., in linear temporal logics on finite traces such as LTLf and LDLf, and show that this yields a notable representational advantage with respect to a flattened representation in which the clock is not explicit.
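To make the idea of a regular clock concrete, the following is a minimal Python sketch, not the construction from the paper. It assumes (hypothetically) that a tick fires whenever the clock machine reaches an accepting state, that the clock then restarts from its initial state, and that the high-level machine reads the low-level symbol observed at the tick. The names `DFA` and `clocked_run` are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Hashable, List, Sequence, Tuple

State = Hashable
Symbol = str


@dataclass(frozen=True)
class DFA:
    """Deterministic finite-state machine over low-level symbols."""
    initial: State
    accepting: FrozenSet[State]
    delta: Dict[Tuple[State, Symbol], State]

    def step(self, state: State, symbol: Symbol) -> State:
        return self.delta[(state, symbol)]


def clocked_run(clock: DFA, task: DFA, trace: Sequence[Symbol]) -> Tuple[State, List[int]]:
    """Run clock and task jointly on a low-level trace.

    Assumptions of this sketch (not taken from the paper):
    - a tick fires when the clock reaches an accepting state;
    - after a tick the clock restarts from its initial state;
    - on a tick, the task machine reads the current low-level symbol.
    """
    c, q = clock.initial, task.initial  # the pair (c, q) is the combined state
    ticks: List[int] = []
    for i, sym in enumerate(trace):
        c = clock.step(c, sym)
        if c in clock.accepting:
            q = task.step(q, sym)  # the high-level machine advances only on ticks
            ticks.append(i)
            c = clock.initial
    return q, ticks


# Clock: a segment ends (tick) at the first 'b' or 'c'; 'a' keeps the segment open.
clock = DFA(
    initial=0,
    accepting=frozenset({1}),
    delta={(0, "a"): 0, (0, "b"): 1, (0, "c"): 1},
)

# Task: a tick on 'b' must eventually be followed by a tick on 'c'.
task = DFA(
    initial=0,
    accepting=frozenset({2}),
    delta={
        (0, "b"): 1, (0, "c"): 0,
        (1, "b"): 1, (1, "c"): 2,
        (2, "b"): 2, (2, "c"): 2,
    },
)

final_state, ticks = clocked_run(clock, task, "aabac")
print(final_state, ticks, final_state in task.accepting)
```

In this sketch the combined machine's state is the pair of a clock state and a task state, so its state space is at most the product of the two state counts.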

Regular Clocks for Temporal Task Specifications in Reinforcement Learning / De Giacomo, G.; Favorito, M.; Patrizi, F. - 15450 (2025), pp. 147-160. (23rd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2024, Bolzano, Italy) [10.1007/978-3-031-80607-0_12].

Regular Clocks for Temporal Task Specifications in Reinforcement Learning

De Giacomo G.; Favorito M.; Patrizi F.
2025

23rd International Conference of the Italian Association for Artificial Intelligence, AIxIA 2024
Clock specification; Reinforcement Learning; Temporal Tasks
04 Publication in conference proceedings::04b Conference paper in volume

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1738693