Under the hood of transformer networks for trajectory forecasting

Luca Franco (co-first author); Leonardo Placidi (co-first author); Fabio Galasso (last author)
2023

Abstract

Transformer Networks have established themselves as the de facto state of the art for trajectory forecasting, but there is currently no systematic study of their capability to model the motion patterns of people in isolation, that is, without interactions with other individuals or the social context. This paper proposes the first in-depth study of Transformer Networks (TF) and Bidirectional Transformers (BERT) for forecasting the individual motion of people, without bells and whistles. We conduct an exhaustive evaluation of input/output representations, problem formulations and sequence modeling, including a novel analysis of their capability to predict multi-modal futures. In a comparative evaluation on the ETH+UCY benchmark, both TF and BERT are top performers in predicting individual motions, clearly outperforming RNNs and LSTMs. Furthermore, they remain within a narrow margin of more complex techniques that include both social interactions and scene contexts. Source code will be released for all conducted experiments.
trajectory forecasting; human behavior; transformer networks; BERT; multi-modal future prediction; neural networks; deep learning; computer vision
01 Journal publication::01a Journal article
Under the hood of transformer networks for trajectory forecasting / Franco, Luca; Placidi, Leonardo; Giuliari, Francesco; Hasan, Irtiza; Cristani, Marco; Galasso, Fabio. - In: PATTERN RECOGNITION. - ISSN 0031-3203. - 138:(2023), p. 109372. [10.1016/j.patcog.2023.109372]

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1673543

Citations
  • PubMed Central: N/A
  • Scopus: 9
  • Web of Science (ISI): 7