Fault resilience analysis of a RISC-V microprocessor design through a dedicated UVM environment / Barbirotta, M.; Mastrandrea, A.; Menichelli, F.; Vigli, F.; Blasi, L.; Cheikh, A.; Sordillo, S.; Di Gennaro, F.; Olivieri, M. - (2020), pp. 1-6. (Paper presented at the 33rd IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2020, held Online; Italy) [10.1109/DFT50435.2020.9250871].
Fault resilience analysis of a RISC-V microprocessor design through a dedicated UVM environment
Barbirotta M.;Mastrandrea A.;Menichelli F.;Vigli F.;Blasi L.;Cheikh A.;Sordillo S.;Olivieri M.
2020
Abstract
Fault tolerance is a key requirement in several application domains of embedded processor cores. In a wide variety of applications, however, full protection against faults occurring in any bit of the core may be excessive, and it has been demonstrated that the system-level impact of local faults in a microprocessor chip also depends on the program being executed. As a result, it is relevant to study the resilience of a processor hardware design to fault injection with an application-oriented methodology. Previous studies addressed either physical fault injection on FPGA prototypes, or RTL analysis and mixed-level approaches involving UVM, SystemC and DSL libraries. These methods are based on massive random error injection, which requires impractical amounts of time and is often limited to specific architecture sub-parts. In this work we present the advantages of an RTL, deterministic, bit-level, cycle-accurate fault injection analysis implemented in a pure UVM environment. The approach allows characterizing the fault resilience of each bit of the microarchitecture at application level, paving the way to a subsequent customized protection based on the upper bound of error probability. The characterization also detects, for each bit of the microarchitecture, the time intervals corresponding to critical sections of the program execution, sometimes leading to unexpected results. We discuss the advantages of a hierarchical time-frame span of the execution time with injected faults over a uniform timing distribution of faults, and we set up the error classification methodology according to how each faulty bit can damage the system in different sections of the execution time. We carry out our experiments on the Klessydra T03 RISC-V open-source processor core, covering all 5561 register bits and characterizing two benchmark program executions in less than 100 hours of simulation.

File | Size | Format |
---|---|---|
Barbirotta_Fault-resilience_2020.pdf (archive managers only). Type: publisher's version (published version with the publisher's layout). License: All rights reserved | 591.64 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.