Pose Forecasting in Industrial Human-Robot Collaboration / Sampieri, Alessio; D'Amely di Melendugno, Guido Maria; Avogaro, Andrea; Cunico, Federico; Setti, Francesco; Skenderi, Geri; Cristani, Marco; Galasso, Fabio. - 13698:(2022), pp. 51-69. (Paper presented at the European Conference on Computer Vision, held in Tel Aviv, Israel) [10.1007/978-3-031-19839-7_4].
Pose Forecasting in Industrial Human-Robot Collaboration
Alessio Sampieri (first author); Guido Maria D’Amely di Melendugno; Fabio Galasso (last author)
2022
Abstract
Pushing back the frontiers of collaborative robots in industrial environments, we propose a new Separable-Sparse Graph Convolutional Network (SeS-GCN) for pose forecasting. For the first time, SeS-GCN bottlenecks the interaction of the spatial, temporal and channel-wise dimensions in GCNs, and it learns sparse adjacency matrices through a teacher-student framework. Compared to the state of the art, it uses only 1.72% of the parameters and is ∼4 times faster, while still performing comparably in forecasting accuracy on Human3.6M at 1 s in the future, which enables cobots to be aware of human operators. As a second contribution, we present a new benchmark of Cobots and Humans in Industrial COllaboration (CHICO). CHICO includes multi-view videos, 3D poses and trajectories of 20 human operators and cobots engaging in 7 realistic industrial actions. Additionally, it reports 226 genuine collisions that take place during the human-cobot interaction. We test SeS-GCN on CHICO for two important perception tasks in robotics: human pose forecasting, where it reaches an average error of 85.3 mm (MPJPE) at 1 s in the future with a run time of 2.3 ms, and collision detection, performed by comparing the forecasted human motion with the known cobot motion, where it obtains an F1-score of 0.64.
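The two metrics quoted in the abstract can be illustrated with a minimal Python sketch. MPJPE is the mean Euclidean distance between predicted and ground-truth joints, and the F1-score is the harmonic mean of precision and recall over collision flags. The function names, the pose representation (a list of `(x, y, z)` tuples in millimetres) and, in particular, the distance-threshold rule in `collides` are assumptions made for illustration. The abstract only states that forecasted human motion is compared with the known cobot motion, so this is not necessarily the authors' procedure.

```python
import math

# Assumption: a pose is a list of (x, y, z) joint positions in millimetres.

def mpjpe(predicted, ground_truth):
    """Mean per-joint position error: average Euclidean distance between
    corresponding joints, in the same units as the input."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("poses must be non-empty and have the same joint count")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)

def collides(human_pose, cobot_points, threshold_mm):
    """Hypothetical rule: flag a collision when any forecasted human joint
    lies within threshold_mm of any known cobot point."""
    return any(
        math.dist(joint, point) <= threshold_mm
        for joint in human_pose
        for point in cobot_points
    )

def f1_score(predicted_flags, true_flags):
    """Harmonic mean of precision and recall over boolean collision flags."""
    pairs = list(zip(predicted_flags, true_flags))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, a forecasted pose would be scored against the ground truth with `mpjpe`, each forecasted frame would be passed to `collides` together with the cobot's known positions, and the resulting flags would be compared with annotated collisions using `f1_score`.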