In multi-robot reinforcement learning, the goal is to enable a team of robots to learn a coordinated behavior from direct interaction with the environment. Here, we compare the two main approaches to this challenge, namely independent learners (IL) and joint-action learners (JAL). IL is suitable for highly scalable domains, but it faces non-stationarity issues. JAL, by contrast, overcomes non-stationarity and can generate highly coordinated behaviors, but it presents scalability issues due to the increased size of the search space. We implement and evaluate these methods in a new multi-robot cooperative and adversarial soccer scenario, called the 2 versus 2 free-kick task, where the scalability issues affecting JAL are less relevant given the small number of learners. We implement and deploy these methodologies on a team of simulated NAO humanoid robots. We describe the implementation details of our scenario and show that both approaches are able to achieve satisfactory solutions. Notably, we observe that joint-action learners perform better than independent learners in terms of success rate and quality of the learned policies. Finally, we discuss the results and provide conclusions based on our findings.
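
The scalability trade-off between IL and JAL described in the abstract can be illustrated with a small counting sketch. It assumes, purely for illustration, that every learner chooses from the same discrete set of actions. The size of 5 actions per agent and all function names are hypothetical and are not taken from the paper, whose actual action representation is not stated in this record.

```python
from itertools import product


def independent_output_size(num_agents: int, actions_per_agent: int) -> int:
    """Total action outputs when each agent learns over its own actions only."""
    return num_agents * actions_per_agent


def joint_action_space_size(num_agents: int, actions_per_agent: int) -> int:
    """Number of joint actions a single centralized learner must search over."""
    return actions_per_agent ** num_agents


def enumerate_joint_actions(num_agents: int, actions_per_agent: int) -> list:
    """List every joint action as a tuple holding one action index per agent."""
    return list(product(range(actions_per_agent), repeat=num_agents))


if __name__ == "__main__":
    ACTIONS_PER_AGENT = 5  # hypothetical value, chosen only for illustration
    for n in (2, 4, 8):
        il = independent_output_size(n, ACTIONS_PER_AGENT)
        jal = joint_action_space_size(n, ACTIONS_PER_AGENT)
        print(f"agents={n}  independent outputs={il}  joint actions={jal}")
```

Under these assumptions, two learners give a joint space of 25 actions, which is still small. The joint space grows exponentially with the number of learners while the independent case grows linearly, which is consistent with the abstract's remark that JAL's scalability issues matter less in a two-learner task.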

Cooperative Multi-agent Deep Reinforcement Learning in a 2 Versus 2 Free-Kick Task / Catacora Ocana, J. M.; Riccio, F.; Capobianco, R.; Nardi, D. - 11531:(2019), pp. 44-57. (Paper presented at the 23rd Annual RoboCup International Symposium, RoboCup 2019, held in Sydney, Australia) [10.1007/978-3-030-35699-6_4].

Cooperative Multi-agent Deep Reinforcement Learning in a 2 Versus 2 Free-Kick Task

Catacora Ocana J. M.; Riccio F.; Capobianco R.; Nardi D.
2019

23rd Annual RoboCup International Symposium, RoboCup 2019
Reinforcement learning; Learning algorithms; Policy gradient
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this item
File: CatacoraOcana_Cooperative_2019.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.26 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1382008
Citations
  • PMC: N/A
  • Scopus: 2
  • ISI: 3