A Cognitive Architecture for Embodied AI based on LLM Common-sense Knowledge / Saladino, Alessio; Brienza, Michele; Suriani, Vincenzo; Bloisi, Domenico Daniele; Iocchi, Luca. - (2025).
A Cognitive Architecture for Embodied AI based on LLM Common-sense Knowledge
Alessio Saladino;Michele Brienza;Vincenzo Suriani;Domenico Daniele Bloisi;Luca Iocchi
2025
Abstract
The improved performance of large language models (LLMs) has allowed them to be employed in a wide variety of language-related tasks. In this work, we propose a robot-agnostic Cognitive Architecture for Human-Robot Interaction (HRI) that allows the robot on which it is deployed to reason about its embodiment and its surrounding environment in order to decide how to act during interactions with humans. Our architecture includes a long-term memory, which allows the robot to remember past information, and a series of modules called Supervisors. The Supervisors' role is to orchestrate the interaction process so that the robot's behavior does not diverge from the desired one and respects security criteria that depend on factors such as the domain in which the robot operates, the users, and the robot's embodiment. To highlight the adaptability of our architecture, we tested it on four different robots, each with a different set of skills. We evaluated this architecture during a dialogue between two robots, NAO and SMARRtino, in which they had to reason about their embodiment and explain to each other what they can do with it.


