Aminof, B.; De Giacomo, G.; Rubin, S.; Zuleger, F. (2022). Beyond Strong-Cyclic: Doing Your Best in Stochastic Environments. In: Proceedings of IJCAI (ISSN 1045-0823), pp. 2525-2531. Paper presented at the International Joint Conference on Artificial Intelligence, held in Wien, Austria. DOI: 10.24963/ijcai.2022/350.
Beyond Strong-Cyclic: Doing Your Best in Stochastic Environments
Aminof B.; De Giacomo G.; Rubin S.; Zuleger F.
2022
Abstract
“Strong-cyclic policies” were introduced to formalize trial-and-error strategies and are known to work in Markovian stochastic domains, i.e., they guarantee that the goal is reached with probability 1. We introduce “best-effort” policies for (not necessarily Markovian) stochastic domains. These generalize strong-cyclic policies by taking advantage of stochasticity even if the goal cannot be reached with probability 1. We compare such policies with optimal policies, i.e., policies that maximize the probability that the goal is achieved, and show that optimal policies are best-effort, but that the converse is false in general. With this framework at hand, we revisit the foundational problem of what it means to plan in nondeterministic domains when the nondeterminism has a stochastic nature. We show that one can view a nondeterministic planning domain as a representation of infinitely many stochastic domains with the same support but different probabilities, and that for temporally extended goals expressed in LTL/LTLf a finite-state best-effort policy in one of these domains is best-effort in each of the domains. In particular, this gives an approach for finding such policies that reduces to solving finite-state MDPs with LTL/LTLf goals. All this shows that “best-effort” policies are robust to changes in the probabilities, as long as the support is unchanged.
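To make the reachability-maximization idea behind the reduction concrete, here is a minimal, self-contained Python sketch (not taken from the paper): it computes an optimal, and hence best-effort, policy for a toy finite-state MDP with a plain reachability goal via value iteration. The state names, transition probabilities, and the helper `max_reach_policy` are illustrative assumptions; for temporally extended LTL/LTLf goals, the reduction described in the abstract would additionally involve a product with an automaton for the goal, which is beyond this sketch.

```python
# Illustrative toy MDP (assumed, not from the paper): for each state, each action
# maps to a probability distribution over successor states. The *support* of these
# distributions is what a nondeterministic planning domain fixes; the concrete
# probabilities are just one arbitrary instantiation.
MDP = {
    "s0":   {"risky": {"goal": 0.5, "dead": 0.5},   # gamble immediately
             "safe":  {"s1": 1.0}},                 # move to a better position first
    "s1":   {"try":   {"goal": 0.8, "dead": 0.2}},
    "goal": {"stay":  {"goal": 1.0}},               # absorbing goal state
    "dead": {"stay":  {"dead": 1.0}},               # absorbing failure state
}
GOAL = {"goal"}

def max_reach_policy(mdp, goal, iters=1000):
    """Value iteration for the maximal probability of reaching `goal`;
    returns (value function, greedy policy on non-goal states)."""
    v = {s: (1.0 if s in goal else 0.0) for s in mdp}
    for _ in range(iters):
        v = {s: (1.0 if s in goal else
                 max(sum(p * v[t] for t, p in dist.items())
                     for dist in mdp[s].values()))
             for s in mdp}
    policy = {s: max(mdp[s],
                     key=lambda a: sum(p * v[t] for t, p in mdp[s][a].items()))
              for s in mdp if s not in goal}
    return v, policy

values, policy = max_reach_policy(MDP, GOAL)
print(values)   # value of "s0" is 0.8: no strong-cyclic policy exists here,
print(policy)   # yet the optimal (hence best-effort) policy plays "safe" then "try"
```

In this toy example no policy reaches the goal with probability 1 (so no strong-cyclic policy exists), yet the computed policy achieves probability 0.8, illustrating how best-effort policies exploit stochasticity beyond the strong-cyclic setting.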
| File | Size | Format |
|---|---|---|
| Aminof_Beyond-Strong-Cyclic_2022.pdf | 169.93 kB | Adobe PDF |

Access: open access
Note: https://www.ijcai.org/proceedings/2022/0350.pdf
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.