The size of MDP factored policies / Liberatore, Paolo. - (2002), pp. 267-272. (Paper presented at the 18th National Conference on Artificial Intelligence (AAAI-02) / 14th Innovative Applications of Artificial Intelligence Conference (IAAI-02), held in Edmonton, Alberta, Canada, 28 July - 1 August 2002).
The size of MDP factored policies
Liberatore, Paolo
2002
Abstract
Policies of Markov Decision Processes (MDPs) specify the next action to execute, given the current state and (possibly) the history of actions executed so far. Factorization is used when the number of states is exponentially large: both the MDP and the policy can then be represented in a compact form, for example using circuits. We prove that there are MDPs whose optimal policies require exponential space even in factored form.
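To make the idea of a factored policy concrete, here is a minimal illustrative sketch, not taken from the paper: the state is a vector of Boolean variables, and the policy is a small Boolean circuit over those variables rather than an explicit state-to-action table. The toy domain, variable count, and action names are all hypothetical.

```python
# Minimal sketch of a factored policy (hypothetical toy domain, not from the paper).
# With n Boolean state variables, an explicit policy table needs 2**n entries,
# while a factored policy is a small function (circuit) of the variables.

N = 20  # number of Boolean state variables; an explicit table would need 2**20 entries


def factored_policy(state):
    """Choose an action from the state bits via a small Boolean circuit.

    `state` is a tuple of N bits. This particular circuit is arbitrary;
    the point is that its size is polynomial in N, independent of 2**N.
    """
    x = state
    if x[0] and not x[1]:
        return "left"
    if x[1] or (x[2] and x[3]):
        return "right"
    return "wait"


# The policy is defined on all 2**N states but stored in O(1) space here.
print(factored_policy((1, 0) + (0,) * (N - 2)))  # -> "left"
```

The paper's result is that this compactness cannot always be achieved: for some MDPs, every circuit representing an optimal policy has exponential size.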
File | Type | License | Size | Format | Access
---|---|---|---|---|---
VE_2002_11573-206817.pdf | Publisher's version (published with the publisher's layout) | All rights reserved | 816.7 kB | Adobe PDF | Archive administrators only (contact the author)
Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.