As the Internet of Things expands, embedding Artificial Intelligence algorithms in resource-constrained devices has become increasingly important to enable real-time, autonomous decision-making without relying on centralized cloud servers. However, implementing and executing complex algorithms in embedded devices poses significant challenges due to limited computational power, memory, and energy resources. 2This article presents algorithmic and hardware techniques to efficiently implement two LinearUCB Contextual Bandits algorithms on resource-constrained embedded devices. Algorithmic modifications based on the Sherman–Morrison–Woodbury formula streamline model complexity, while vector acceleration is harnessed to speed up matrix operations. We analyze the impact of each optimization individually and then combine them in a two-pronged strategy. The results show notable improvements in execution time and energy consumption, demonstrating the effectiveness of combining algorithmic and hardware optimizations to enhance learning models for edge computing environments with low-power and real-time requirements.

Efficient implementation of linearUCB through algorithmic improvements and vector computing acceleration for embedded learning systems / Angioli, Marco; Barbirotta, Marcello; Cheikh, Abdallah; Mastrandrea, Antonio; Menichelli, Francesco; Olivieri, Mauro. - In: ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS. - ISSN 1539-9087. - 24:4(2025), pp. 1-23. [10.1145/3736226]

Efficient implementation of linearUCB through algorithmic improvements and vector computing acceleration for embedded learning systems

Marco Angioli
;
Marcello Barbirotta;Abdallah Cheikh;Antonio Mastrandrea;Francesco Menichelli;Mauro Olivieri
2025

Abstract

As the Internet of Things expands, embedding Artificial Intelligence algorithms in resource-constrained devices has become increasingly important to enable real-time, autonomous decision-making without relying on centralized cloud servers. However, implementing and executing complex algorithms in embedded devices poses significant challenges due to limited computational power, memory, and energy resources. 2This article presents algorithmic and hardware techniques to efficiently implement two LinearUCB Contextual Bandits algorithms on resource-constrained embedded devices. Algorithmic modifications based on the Sherman–Morrison–Woodbury formula streamline model complexity, while vector acceleration is harnessed to speed up matrix operations. We analyze the impact of each optimization individually and then combine them in a two-pronged strategy. The results show notable improvements in execution time and energy consumption, demonstrating the effectiveness of combining algorithmic and hardware optimizations to enhance learning models for edge computing environments with low-power and real-time requirements.
2025
LinearUCB algorithms; sherman morrison woodbury formula; vector hardware acceleration; embedded AI optimization
01 Pubblicazione su rivista::01a Articolo in rivista
Efficient implementation of linearUCB through algorithmic improvements and vector computing acceleration for embedded learning systems / Angioli, Marco; Barbirotta, Marcello; Cheikh, Abdallah; Mastrandrea, Antonio; Menichelli, Francesco; Olivieri, Mauro. - In: ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS. - ISSN 1539-9087. - 24:4(2025), pp. 1-23. [10.1145/3736226]
File allegati a questo prodotto
File Dimensione Formato  
Angioli_efficient-implementation-of_2025.pdf

accesso aperto

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza: Creative commons
Dimensione 1.56 MB
Formato Adobe PDF
1.56 MB Adobe PDF

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11573/1749837
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 0
social impact