Catalogo dei prodotti della ricerca

We use Reinforcement Learning, in particular a deep Q-Learning algorithm and an adaptation of two actor-critic algorithms (Proximal Policy Optimization and Phasic Policy Gradient), originally proposed for robotic control, to solve the metric Traveling Salesperson Problem (TSP). We introduce a convolutional model to approximate action-value and state-value functions, centered on the idea of considering a weighted incidence matrix as the agent’s graph representation at a given instant. Our computational experience shows that Q-Learning does not seem to be adequate to solve the TSP, but nevertheless we find that both PPO and PPG can achieve the same solution of standard optimization algorithms with a smaller computational effort; we also find that our trained models are able to orient themselves through new unseen graphs and with different costs distributions.

Heuristics for the Traveling Salesperson Problem based on Reinforcement Learning / Coppola, Corrado; Grani, Giorgio; Monaci, Marta; Palagi, Laura. - (2021).

Heuristics for the Traveling Salesperson Problem based on Reinforcement Learning

Corrado Coppola^Primo;Giorgio Grani^Secondo;Marta Monaci^Penultimo;Laura Palagi^Ultimo

2021

Abstract

We use Reinforcement Learning, in particular a deep Q-Learning algorithm and an adaptation of two actor-critic algorithms (Proximal Policy Optimization and Phasic Policy Gradient), originally proposed for robotic control, to solve the metric Traveling Salesperson Problem (TSP). We introduce a convolutional model to approximate action-value and state-value functions, centered on the idea of considering a weighted incidence matrix as the agent’s graph representation at a given instant. Our computational experience shows that Q-Learning does not seem to be adequate to solve the TSP, but nevertheless we find that both PPO and PPG can achieve the same solution of standard optimization algorithms with a smaller computational effort; we also find that our trained models are able to orient themselves through new unseen graphs and with different costs distributions.

Scheda breve

Scheda completa

Anno di pubblicazione

2021

Appartiene alla tipologia:

13a Altro ministeriale

File allegati a questo prodotto

Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1565049

Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni

ND

ND

ND

social impact