Automatic generation and learning of finite-state controllers / Leonetti, Matteo; Iocchi, Luca; Patrizi, Fabio. - 7557 LNAI (2012), pp. 135-144. (Paper presented at the 15th International Conference on Artificial Intelligence: Methodology, Systems, and Applications, AIMSA 2012, held in Varna, 12-15 September 2012) [10.1007/978-3-642-33185-5_15].
Automatic generation and learning of finite-state controllers
Leonetti, Matteo; Iocchi, Luca; Patrizi, Fabio
2012
Abstract
We propose a method for generating and learning agent controllers that combines techniques from automated planning and reinforcement learning. An incomplete description of the domain is first used to generate a non-deterministic automaton able to act (sub-optimally) in the given environment. This controller is then refined through experience, by learning the choices at its non-deterministic points. On the one hand, the incompleteness of the model, which would make a pure-planning approach ineffective, is overcome through learning. On the other hand, the available portion of the domain drives the learning process, which would otherwise be excessively expensive. Our method makes it possible to adapt the behavior of a given planner to the environment, coping with the unavoidable discrepancies between the model and the environment. We provide quantitative experiments with a simulator of a mobile robot to assess the performance of the proposed method. © 2012 Springer-Verlag.
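The abstract outlines a two-stage scheme: a planner produces a non-deterministic finite-state controller from a partial domain model, and reinforcement learning then resolves the remaining non-deterministic choice points from experience. The following is a minimal, hypothetical sketch of that idea; the class, its field names, and the use of tabular Q-learning over choice points are our own illustrative assumptions, not the paper's implementation.

```python
import random
from collections import defaultdict

class LearnedController:
    """Hypothetical sketch: a non-deterministic finite-state controller
    whose choice points are resolved by tabular Q-learning."""

    def __init__(self, transitions, epsilon=0.1, alpha=0.5, gamma=0.95):
        # transitions: controller state -> list of candidate actions
        # produced by the planner; a list longer than 1 is a choice point.
        self.transitions = transitions
        self.q = defaultdict(float)  # Q-value for each (state, choice index)
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def choose(self, state):
        """Pick a candidate index; explore with probability epsilon."""
        options = self.transitions[state]
        if len(options) == 1 or random.random() < self.epsilon:
            return random.randrange(len(options))
        # Greedy resolution of the non-deterministic point.
        return max(range(len(options)), key=lambda i: self.q[(state, i)])

    def update(self, state, choice, reward, next_state):
        """One-step Q-learning update, applied only at choice points."""
        best_next = max(
            (self.q[(next_state, i)]
             for i in range(len(self.transitions[next_state]))),
            default=0.0,
        )
        key = (state, choice)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])
```

Under this reading, `transitions` would be extracted from the automaton generated by the planner, so learning is confined to the (typically few) states where the incomplete model leaves more than one acceptable action, rather than to the full state-action space.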
File | Type | License | Size | Format | Access
---|---|---|---|---|---
VE_2012_11573-486478.pdf | Publisher's version (published with the publisher's layout) | All rights reserved | 424.16 kB | Adobe PDF | Archive administrators only (contact the author)
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.