Neural algorithmic reasoning studies the problem of learning algorithms with neural networks, especially using graph architectures. A recent proposal, XLVIN, reaps the benefits of using a graph neural network that simulates the value iteration algorithm in deep reinforcement learning agents. It allows model-free planning without access to privileged information about the environment, which is usually unavailable. However, XLVIN only supports discrete action spaces, and is hence nontrivially applicable to most tasks of real-world interest. We expand XLVIN to continuous action spaces by discretization, and evaluate several selective expansion policies to deal with the large planning graphs. Our proposal, CNAP, demonstrates how neural algorithmic reasoning can make a measurable impact in higher-dimensional continuous control settings, such as MuJoCo, bringing gains in low-data settings and outperforming model-free baselines.
Continuous Neural Algorithmic Planners / He, Y.; Velickovic, P.; Lio, P.; Deac, A.. - 198:(2022). (Intervento presentato al convegno 1st Learning on Graphs Conference, LOG 2022 tenutosi a Virtual, Online).
Continuous Neural Algorithmic Planners
Lio P.
;
2022
Abstract
Neural algorithmic reasoning studies the problem of learning algorithms with neural networks, especially using graph architectures. A recent proposal, XLVIN, reaps the benefits of using a graph neural network that simulates the value iteration algorithm in deep reinforcement learning agents. It allows model-free planning without access to privileged information about the environment, which is usually unavailable. However, XLVIN only supports discrete action spaces, and is hence nontrivially applicable to most tasks of real-world interest. We expand XLVIN to continuous action spaces by discretization, and evaluate several selective expansion policies to deal with the large planning graphs. Our proposal, CNAP, demonstrates how neural algorithmic reasoning can make a measurable impact in higher-dimensional continuous control settings, such as MuJoCo, bringing gains in low-data settings and outperforming model-free baselines.File | Dimensione | Formato | |
---|---|---|---|
He_Continuous_2022.pdf
accesso aperto
Note: https://proceedings.mlr.press/v198/he22a/he22a.pdf
Tipologia:
Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza:
Creative commons
Dimensione
6.46 MB
Formato
Adobe PDF
|
6.46 MB | Adobe PDF |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.