Hussien Al-haj Ahmad

Iran Ferdowsi University of Mashhad

Published in 2022
Software-based Control-Flow Error Detection with Hardware Performance Counters in ARM Processors
hardware performance counter control-flow checking safety-critical systems software-based error detection
Authors: Hussien Al-haj Ahmad, Yasser Sedaghat
Journal: 2022 CPSSI 4th International Symposium on Real-Time and Embedded Systems and Technologies (RTEST)
Description:
The recent trend in processor manufacturing technologies has significantly increased the susceptibility of safety-critical systems to soft errors in harsh environments. Such errors result in control-flow errors (CFEs) that can disturb a system's execution and cause severe financial, human, or environmental disasters. Therefore, there is a pressing need for efficient techniques to detect CFEs and keep these systems fault-tolerant. Although numerous control-flow error detection techniques have been proposed, they impose considerable overheads, making them inappropriate for today's safety-critical systems with restricted resources. Several techniques attempt to insert fewer control-flow checking instructions to reduce overheads; however, doing so limits fault coverage. This paper proposes a software-based technique for ARM processors to detect CFEs. The technique leverages the Hardware Performance Counters (HPCs), which exist in most modern processors, to count micro-architectural events and generate HPC-based signatures. Because these signatures capture the correct control flow of the program, the proposed technique can detect a CFE as soon as the correct control flow is violated. We evaluate the detection capability of the proposed technique by performing many fault injection experiments on different benchmark programs. Moreover, we compare the proposed technique with common signature-based CFE detection techniques with respect to fault coverage and imposed overheads. The results demonstrate that the proposed technique achieves, on average, ~99% fault coverage, which is 23.57% higher than that offered by the employed signature-based techniques. Moreover, the memory overhead imposed by the proposed technique is 4.85% lower, and the performance overhead is ~19% lower, than that of the studied signature-based techniques.
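The idea of a counter-based signature can be illustrated with a toy model. The sketch below is not the paper's algorithm and does not read a real ARM performance counter: the counter is simulated in plain Python, and the block names, the per-block event counts, and the `first_violation` helper are all hypothetical. It only shows the general principle that a checkpoint can compare the number of events counted since the previous checkpoint against a precomputed expected value, so that a jump which skips instructions produces a mismatch.

```python
"""Toy model of signature checking with a simulated event counter."""

# Hypothetical per-block signatures: the number of events (for example,
# retired instructions) expected between two consecutive checkpoints.
EXPECTED_EVENTS = {"entry": 4, "loop": 7, "exit": 3}


def first_violation(trace, expected=EXPECTED_EVENTS):
    """Return the index of the first checkpoint whose count mismatches.

    `trace` is a list of (block_name, events_counted) pairs, where
    `events_counted` is what the simulated counter accumulated before the
    checkpoint at the end of `block_name` was reached. Returns None when
    every checkpoint matches its expected signature.
    """
    counter = 0          # simulated free-running event counter
    last_checkpoint = 0  # counter value saved at the previous checkpoint
    for index, (block, events) in enumerate(trace):
        counter += events
        delta = counter - last_checkpoint
        if delta != expected[block]:
            return index  # control flow deviated before this checkpoint
        last_checkpoint = counter
    return None


# Fault-free run: every block executes all of its instructions.
fault_free = [("entry", 4), ("loop", 7), ("loop", 7), ("exit", 3)]

# Faulty run: an erroneous jump lands in the middle of "loop", so only
# 5 of its 7 events are counted before the checkpoint.
faulty = [("entry", 4), ("loop", 5), ("exit", 3)]

print(first_violation(fault_free))
print(first_violation(faulty))
```

With the traces above, the fault-free run yields no violation and the faulty run is flagged at the second checkpoint (index 1). On real hardware the count would come from a performance counter, and the expected values would have to be derived from the program itself.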
Published in 2022
GCFI: A High Accurate Compiler-based Fault Injection for Transient Hardware Faults
compiler-based fault injection assessing resilience transient hardware faults compiler extension
Authors: Hussien Al-haj Ahmad, Yasser Sedaghat
Journal: 2022 CPSSI 4th International Symposium on Real-Time and Embedded Systems and Technologies (RTEST)
Description:
Recently, with increasing system complexity and advanced technology scaling, there is a pressing need for accurate fault injection (FI) techniques in the reliability evaluation of safety-critical systems against transient hardware faults, such as soft errors. Since compiler-based FI techniques operate on high-level intermediate representation (IR) code, their accuracy is insufficient to assess the resilience of safety-critical systems against soft errors. Although binary-level FI techniques can provide high accuracy, error propagation analysis is challenging due to missing program structures. This paper proposes an accurate GCC compiler-based FI technique called GCFI to assess the resilience of software against soft errors. GCFI operates at the back-end of the GCC compiler and instruments the very low-level IR code through a compiler extension. GCFI performs instrumentation only once, right after the completion of the optimization passes, assuring one-to-one correspondence of IR code with assembly code. The effectiveness of GCFI is evaluated by employing it to conduct many FI experiments on different benchmarks compiled for x86 and ARM architectures. We compare the results with high-level and binary-level software FI techniques to demonstrate the accuracy of GCFI. The results show that GCFI can assess the resilience of programs against soft errors with high accuracy, similar to binary-level FI.
Published in 2022
CAFI: A Configurable location-Aware Fault Injection technique for software reliability assessment against soft errors
Fault injection Soft error Reliability assessment Fault tolerance
Authors: Hussien Al-haj Ahmad, Yasser Sedaghat
Journal: Microprocessors and Microsystems
Description:
The rapid development of processor manufacturing technologies has encouraged most designers to highlight the processors' resilience to errors. In embedded systems operating in harsh environments, temporary faults, such as single-event upsets (SEUs) or single-event multiple upsets (SEMUs), induced by radiation effects on a memory cell or a combinational logic circuit, can cause perturbations in the system's behavior, which in turn can have catastrophic consequences. In this context, Fault Injection (FI) has been successfully applied as a mature method to assess reliability against faults and reveal the system's deficiencies so that they can be protected cost-effectively. Although high-level software FI techniques are less accurate, the high accuracy of fault injection at a low level usually comes at the expense of desirable characteristics related to portability, flexibility, and intrusiveness. Hence, this paper proposes a Configurable location-Aware Fault Injection (CAFI) technique. CAFI is a software-based technique that facilitates emulating different kinds of transient hardware faults, e.g., SEUs and SEMUs, at the software level. Therefore, it supports flexibility in conducting reliability assessment studies with varying fault models. It operates at the binary code level and exploits a timing-based mechanism to inject faults at run-time with negligible intrusiveness. CAFI can inject faults at different granularity levels to maximize fault activation: fine-grained injection into a specific instruction field, and coarse-grained injection into the whole system's software. CAFI requires negligible modifications to the target software under test and allows it to run at near-native speed. The effectiveness of CAFI is evaluated by conducting many fault injection experiments on different real-world benchmark programs. Moreover, the accuracy of CAFI is quantified with respect to high-level software fault injection.
To this end, a practical prototype of CAFI is implemented on the x86 architecture, employing an Intel Core i7 processor with 16GB RAM. Based on a total of 108,000 fault injection experiments, the rate of program crashes caused by CAFI is slightly more than 18% higher than that caused by high-level software fault injection. On the other hand, there are no significant differences between the silent data corruption (SDC) results obtained by CAFI and those obtained by high-level software fault injection, making CAFI applicable for studying crash- and SDC-causing errors.
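The fault models and outcome categories mentioned in this abstract can be made concrete with a small sketch. The code below is not CAFI: it does not touch binary code or timing, and every name in it (`flip_bits`, `inject_seu`, `inject_semu`, `classify`, `TABLE`, `lookup`) is hypothetical. It assumes that an SEU is modeled as a single bit flip and an SEMU as several simultaneous bit flips in one word, and that a run is classified by comparing its result with a fault-free (golden) run.

```python
"""Toy bit-flip fault models and outcome classification."""
import random


def flip_bits(value, positions):
    """Return `value` with the bits at the given positions inverted."""
    for position in positions:
        value ^= 1 << position
    return value


def inject_seu(value, width=32, rng=random):
    """Single-event upset: flip one randomly chosen bit of a word."""
    return flip_bits(value, [rng.randrange(width)])


def inject_semu(value, count, width=32, rng=random):
    """Single-event multiple upset: flip `count` distinct random bits."""
    return flip_bits(value, rng.sample(range(width), count))


def classify(program, golden_input, faulty_input):
    """Compare a faulty run against the golden run of `program`."""
    golden = program(golden_input)
    try:
        observed = program(faulty_input)
    except Exception:
        return "crash"   # the fault made the program fail visibly
    if observed == golden:
        return "benign"  # the fault was masked
    return "SDC"         # wrong result without any error indication


# Hypothetical program under test: a table lookup by index.
TABLE = [5, 7, 5, 9]


def lookup(index):
    return TABLE[index]


print(classify(lookup, 0, flip_bits(0, [1])))  # index 2 holds the same value
print(classify(lookup, 0, flip_bits(0, [0])))  # index 1 holds another value
print(classify(lookup, 0, flip_bits(0, [4])))  # index 16 is out of range
```

The three calls print `benign`, `SDC`, and `crash` in that order, which mirrors the categories used when comparing fault injection techniques. A real campaign would repeat such runs many times with `inject_seu` or `inject_semu` choosing the bit positions at random.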
Published in 2019
LDSFI: a Lightweight Dynamic Software-based Fault Injection
dynamic binary injection fault injection soft errors software-implemented fault injection
Authors: Hussien Al-haj Ahmad, Yasser Sedaghat, Mahin Moradiyan
Journal: 2019 9th International Conference on Computer and Knowledge Engineering (ICCKE)
Description:
Recently, numerous safety-critical systems have employed a variety of fault tolerance techniques, which are considered an essential requirement to keep the system fault-tolerant. While the current trend in processor technology has increased processors' effectiveness and performance, their sensitivity to soft errors has increased significantly, making their fault tolerance ability questionable. In this context, fault injection is considered one of the most popular, rapid, and cost-effective techniques, enabling designers to assess the fault tolerance of systems under faults before their deployment. In this paper, a pure software fault injection technique called LDSFI (a Lightweight Dynamic Software-based Fault Injection) is presented and evaluated. Due to the dynamic aspect of LDSFI, faults are automatically injected into binary code at runtime. Thereby, the proposed technique does not impose any program runtime overhead since the intended source code is not required. The effectiveness of LDSFI was validated by performing exhaustive fault injection experiments using well-known benchmarks. The experiments were carried out on an Intel x86 dual-core PC with a Core 2 Duo processor and 4GB RAM, running Ubuntu Linux 14.04 with the GNU Compiler Collection (GCC) version 4.9. Since LDSFI relies on the GNU, it is highly portable and can be adapted for different platforms.
Published in 2017
A performance counter-based control flow checking technique for multi-core processors
Soft errors Control-flow errors Control-flow checking
Authors: Hussien Al-haj Ahmad, Yasser Sedaghat, Mohammadreza Rezaei
Journal: 2017 7th International Conference on Computer and Knowledge Engineering (ICCKE)
Description:
Today, both the rapid improvement of process technology and the arrival of new embedded systems with high-performance requirements have shifted the current trend in processor manufacturing from single-core processors to multi-core processors. This trend has raised several challenges for reliability in safety-critical systems that operate in high-risk environments, making them more vulnerable to soft errors. Hence, using additional methods to satisfy the strict system requirements in terms of safety and reliability is unavoidable. In this paper, an efficient hybrid method to detect control-flow errors in multi-core processors is proposed and evaluated. About 36,000 software faults have been injected into three well-known multi-threaded benchmarks at run-time. The experimental results show that the fault coverage is 100%. The results also show that the execution time overhead varies between 31.25% and 51.02%, and the program size overhead varies between 20.23% and 67.64%, depending on the employed benchmark.