Fault tolerance is a key requirement in several application domains of embedded processor cores. In a wide variety of applications, however, full protection against faults occurring in any bit of the core may be excessive, and it has been demonstrated that the system-level impact of local faults in microprocessor chips also depends on the program being executed. As a result, it is relevant to study the fault resilience of a processor hardware design with an application-oriented methodology. Previous studies addressed either physical fault injection on FPGA prototypes, or RTL analysis and mixed-level approaches involving UVM, SystemC and DSL libraries. These methods are based on massive random error injection, which requires impractical amounts of time and is often limited to specific architecture sub-parts. In this work we present the advantages of an RTL, deterministic, bit-level, cycle-accurate fault injection analysis implemented in a pure UVM environment. The approach allows characterizing the fault resilience of each bit of the microarchitecture at application level, paving the way to subsequent customized protection based on the upper bound of error probability. The characterization also detects, for each bit of the microarchitecture, the time intervals corresponding to critical sections of the program execution, sometimes leading to unexpected results. We discuss the advantages of a hierarchical time frame span of the execution time with injected faults rather than a uniform timing distribution of faults, and we set up the error classification methodology according to how each faulty bit can damage the system in different execution time sections. We carry out our experiments targeting the Klessydra T03 RISC-V open-source processor core, covering all of the 5561 register bits and characterizing two benchmark program executions, in less than 100 hours of simulation.

Fault resilience analysis of a RISC-V microprocessor design through a dedicated UVM environment / Barbirotta, M.; Mastrandrea, A.; Menichelli, F.; Vigli, F.; Blasi, L.; Cheikh, A.; Sordillo, S.; Di Gennaro, F.; Olivieri, M. - (2020), pp. 1-6. (Paper presented at the 33rd IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2020, held online; Italy) [10.1109/DFT50435.2020.9250871].

Fault resilience analysis of a RISC-V microprocessor design through a dedicated UVM environment

Barbirotta M.; Mastrandrea A.; Menichelli F.; Vigli F.; Blasi L.; Cheikh A.; Sordillo S.; Olivieri M.
2020

33rd IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2020
fault tolerance; microprocessors; UVM
04 Publication in conference proceedings::04b Conference paper in volume
Files attached to this record
Barbirotta_Fault-resilience_2020.pdf (archive managers only)
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 591.64 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1540143