Article
Automation & Control Systems
Rajesh Devaraj, Arnab Sarkar
Summary: Real-time control applications are highly parallelizable and need to effectively utilize computing platforms to ensure system reliability in the presence of transient processor faults. Existing scheduling approaches for parallel applications are heuristic in nature, leading to suboptimal resource usage and increased design costs. Formal model-based safe design methodologies, such as supervisory control, are desirable for constructing fault-tolerant schedulers.
IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS
(2021)
Article
Computer Science, Theory & Methods
Mohsen Ansari, Sepideh Safari, Heba Khdr, Pourya Gohari-Nazari, Joerg Henkel, Alireza Ejlali, Shaahin Hessabi
Summary: This article introduces a peak-power-aware checkpointing (PPAC) technique that tolerates faults and meets power constraints in hard real-time embedded systems. By adjusting the timing of checkpoints and utilizing the available slack times on the cores, the technique reduces peak power and saves energy.
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
(2022)
Article
Computer Science, Hardware & Architecture
Behnaz Ranjbar, Ali Hosseinghorban, Mohammad Salehi, Alireza Ejlali, Akash Kumar
Summary: This article proposes a technique for managing peak-power consumption and temperature in mixed-criticality systems. The technique reduces the drop rate of low-criticality tasks and improves peak-power consumption and maximum temperature. It achieves this through a design-time tree of task mapping and scheduling, as well as runtime schedule selection based on fault occurrences and criticality mode changes.
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS
(2022)
Article
Computer Science, Hardware & Architecture
Sina Yari-Karin, Roozbeh Siyadatzadeh, Mohsen Ansari, Alireza Ejlali
Summary: In addition to real-time constraints, power/energy efficiency and high reliability are important objectives for real-time embedded systems. Heterogeneous multicore systems have been considered a suitable solution for achieving joint power/energy efficiency and high reliability. However, power/energy and reliability are conflicting requirements due to fault-tolerance techniques. The method proposed in this article uses a passive primary/backup technique to maintain system reliability while reducing power/energy consumption in heterogeneous multicore systems. It maps primary and backup tasks in a mixed manner to take advantage of different core types and schedules backup tasks after primary tasks to avoid overlap. Experimental results demonstrate the power efficiency and effectiveness of the proposed method in terms of scheduling.
IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING
(2023)
Article
Computer Science, Hardware & Architecture
Pedro J. Lobo, Eduardo Juarez, Fernando Pescador, Cesar Sanz
Summary: This article introduces a development methodology for implementing software radio applications on heterogeneous multicore platforms, using the GNU radio toolkit, and validates the method by porting a DVB-T receiver to an embedded platform. By simultaneously accelerating two algorithms, the performance is improved by 63%.
IEEE CONSUMER ELECTRONICS MAGAZINE
(2021)
Article
Computer Science, Information Systems
Mohsen Ansari, Sepideh Safari, Sina Yari-Karin, Pourya Gohari-Nazari, Heba Khdr, Muhammad Shafique, Joerg Henkel, Alireza Ejlali
Summary: This paper proposes a thermal-aware standby-sparing technique that aims to maximize the Quality of Service (QoS) of real-time tasks while meeting power constraints and preventing thermal emergencies. The technique tolerates faults and reduces power consumption by removing overlaps between main and backup tasks. By employing a heterogeneous platform, the main tasks are executed on high-performance cores while the backup tasks are executed on low-power cores. Experiments show significant improvements in QoS, power consumption, and temperature reduction.
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING
(2022)
Article
Engineering, Marine
Kantapon Tanakitkorn, Surasak Phoemsapthawee
Summary: This paper investigates the performance of different thruster configurations for USVs and compares them in terms of manoeuvrability, power consumption, and fault tolerance. The results demonstrate that certain configurations are more suitable for path following while others are better for station keeping. The authors also introduce a new concept of a convertible thruster configuration to achieve high performance in both operating styles.
Article
Computer Science, Information Systems
Amir Yeganeh-Khaksar, Mohsen Ansari, Alireza Ejlali
Summary: This article proposes a method for mapping and scheduling periodic soft real-time tasks in multicore embedded systems to achieve a given reliability target while keeping the total power consumption under the chip TDP. Experimental results show that the proposed method can significantly reduce peak power consumption.
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING
(2022)
Article
Engineering, Electrical & Electronic
Marta Zarraga-Rodriguez, Xabier Insausti, Fermin Rodriguez Lalanne, Javier Velasco, Jesus Gutierrez-Gutierrez
Summary: This study presents a low-complexity algorithm that provides a fault-tolerant power transmission network by minimizing the redundancy of wiring. The proposed network configuration achieves significant cable length savings compared to traditional decentralized network configurations.
IEEE TRANSACTIONS ON TRANSPORTATION ELECTRIFICATION
(2022)
Article
Computer Science, Theory & Methods
Niloofar Bayat, Kunal Mahajan, Sam Denton, Vishal Misra, Dan Rubenstein
Summary: This study explores a new approach to identifying power outages through intelligent monitoring of IP address availability, using residential Internet connections as indicators and constructing dynamic scoring metrics for reliability evaluation. By tracking power outages with this method, a detection accuracy of 90% was achieved, although some false alarms and missed events were observed.
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
(2021)
Article
Computer Science, Information Systems
Chinmaya Kumar Dehury, Prasan Kumar Sahoo, Bharadwaj Veeravalli
Summary: Deployed cloud applications have interconnected, dependent components, and multiple identical components are run concurrently to handle failures and provide uninterrupted service. This introduces resource overhead for the cloud service provider. To address this, a novel fault-tolerant strategy based on the significance level of each component is developed. A Markov Decision Process model is presented to determine the number of replicas based on component ranking. Simulation results show that the proposed algorithm reduces the required number of virtual and physical machines compared to similar algorithms.
IEEE TRANSACTIONS ON CLOUD COMPUTING
(2023)
Article
Computer Science, Information Systems
Yasmina Bouizem, Djawida Dib, Nikos Parlavantzas, Christine Morin
Summary: Function-as-a-Service (FaaS) is a popular programming model supported by major cloud providers for building serverless applications. The main challenge for FaaS providers is providing fault tolerance for the deployed applications. This paper proposes integrating a Request Replication mechanism in FaaS platforms and compares its performance with the retry approach and an Active-Standby approach in terms of availability, performance, and resource consumption under different failure scenarios.
JOURNAL OF CLOUD COMPUTING-ADVANCES SYSTEMS AND APPLICATIONS
(2023)
Article
Chemistry, Analytical
Alberto Ballesteros, Manuel Barranco, Julian Proenza, Luis Almeida, Francisco Pozo, Pere Palmer-Rodriguez
Summary: Distributed Embedded Systems (DESs) carrying out critical tasks need to be highly reliable, real-time, and adaptive. Dynamic fault tolerance mechanisms are of interest as a way to improve their dependability. This paper presents the Dynamic Fault Tolerance for Flexible Time-Triggered Ethernet (DFT4FTT) as a self-reconfigurable infrastructure for implementing highly reliable and adaptive DESs. The design of its hardware and software architecture and the main fault tolerance mechanisms are described.
Article
Computer Science, Information Systems
Mohammadreza Amel Solouki, Jacopo Sini, Massimo Violante
Summary: This paper introduces hardware hardening strategies and software-implemented hardware fault tolerance methods for embedded systems to improve system reliability. The proposed approach applies software-based control flow error detection techniques in the C programming language to prevent Control Flow Errors before compiling the application code. Two established techniques were compared, showing the effectiveness of the method in addressing common hardware failures in embedded systems.
Article
Automation & Control Systems
Hao Chen, Shuai Xu, Sihang Cui
Summary: A novel scheme is proposed for quantitatively evaluating the reliability of the power converter in a switched reluctance motor system, combining an electrical model, a thermal circuit model, and a dynamical Markov model. The proposed failure criterion simplifies the model, decreases the number of states, and avoids result divergence, emphasizing the interconnected relationship between the three models. The validity of the proposed reliability evaluation scheme is verified through simulations and experiments.
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS
(2021)
Article
Computer Science, Information Systems
Sepideh Safari, Mohsen Ansari, Heba Khdr, Pourya Gohari-Nazari, Sina Yari-Karin, Amir Yeganeh-Khaksar, Shaahin Hessabi, Alireza Ejlali, Jorg Henkel
Summary: This paper provides an in-depth survey of task mapping/scheduling policies for fault-tolerant real-time embedded systems. It reviews and classifies these policies according to their goals and constraints, considering factors such as application models and hardware models. The survey analyzes the achievements and shortcomings of existing approaches and highlights the most promising ones.
Article
Computer Science, Theory & Methods
Sepideh Safari, Heba Khdr, Pourya Gohari-Nazari, Mohsen Ansari, Shaahin Hessabi, Joerg Henkel
Summary: This paper presents a thermal-aware scheduling scheme named TherMa-MiCs for fault-tolerant mixed-criticality systems (MCSs). The scheme ensures the temperature constraint while satisfying the timing constraints of high-criticality tasks and maximizing the QoS of low-criticality tasks.
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
(2022)
Article
Computer Science, Information Systems
Roozbeh Siyadatzadeh, Fatemeh Mehrafrooz, Mohsen Ansari, Bardia Safaei, Muhammad Shafique, Jorg Henkel, Alireza Ejlali
Summary: Due to the real-time requirements of IoT applications, fog computing has emerged to overcome the constraints of cloud computing. However, the reliability of executing real-time tasks in fog computing is a significant challenge. This article proposes a novel task assignment strategy based on machine learning to improve the reliability of fog-based IoT systems. The proposed technique reduces the task dropping rate by up to 84% and increases system reliability by nearly 72% compared to state-of-the-art methods.
IEEE INTERNET OF THINGS JOURNAL
(2023)
Proceedings Paper
Computer Science, Hardware & Architecture
Mohsen Ansari, Sepideh Safari, Amir Yeganeh-Khaksar, Roozbeh Siyadatzadeh, Pourya Gohari-Nazari, Heba Khdr, Muhammad Shafique, Joerg Henkel, Alireza Ejlali
Summary: In this paper, an aging-aware task replication method called ATLAS is proposed for multicore safety-critical systems. The method updates the required number of replicas for each task to meet the reliability target, and reduces the temperature to decelerate aging effects. Experimental results demonstrate the effectiveness of the proposed method in improving schedulability by 16.1% on average and reducing the temperature by 7.4 °C.
2023 IEEE 29TH REAL-TIME AND EMBEDDED TECHNOLOGY AND APPLICATIONS SYMPOSIUM, RTAS
(2023)
Article
Computer Science, Hardware & Architecture
Mohsen Shekarisaz, Ali Hoseinghorban, Mostafa Bazzaz, Mohammad Salehi, Alireza Ejlali
Summary: This paper addresses the issue of energy consumption in the memory subsystem of edge devices by proposing a task mapping, scheduling, and dynamic allocation scheme based on hybrid Scratchpad Memories (SPM). By formulating the hybrid SPM allocation using integer linear programming, the energy consumption of the memory subsystem is minimized. Experimental results demonstrate that the proposed scheme outperforms the existing heuristic dynamic data allocation algorithm, achieving up to 34% energy savings in the memory subsystem.
IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING
(2022)
Article
Computer Science, Information Systems
Fahimeh Bahrami, Behnaz Ranjbar, Nezam Rohbani, Alireza Ejlali
Summary: Embedded Systems have transitioned from special-purpose hardware to commodity hardware, and have tended towards Mixed-Criticality implementations. Multi-cores bring new challenges due to Process Variation and affect the predictability of Embedded Systems. This work explores variation-aware techniques to improve reliability, scheduling, and energy saving in Mixed-Criticality systems.
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING
(2022)