Article
Computer Science, Information Systems
Ricardo Mendes, Tiago Oliveira, Vinicius Cogo, Nuno Neves, Alysson Bessani
Summary: CHARON is a cloud-backed storage system that can store and share big data securely, reliably, and efficiently, without requiring trust in any single entity. It efficiently deals with large files and uses a novel Byzantine-resilient data-centric leasing protocol to avoid write-write conflicts. Evaluation shows up to 2.5x better performance compared to other cloud-backed solutions.
IEEE TRANSACTIONS ON CLOUD COMPUTING
(2021)
Article
Computer Science, Information Systems
Frederico Cerveira, Raul Barbosa, Henrique Madeira, Filipe Araujo
Summary: Virtualized servers are widely used in cloud computing environments to host online applications and provide elastic computing resources. However, the presence of soft errors in large-scale servers can lead to various failure modes, with hang failures being the most common. A recovery mechanism using online testing is developed to address these hang failures and ensure server uptime.
IEEE TRANSACTIONS ON CLOUD COMPUTING
(2022)
Article
Chemistry, Analytical
Muhammad Shaukat, Waleed Alasmary, Eisa Alanazi, Junaid Shuja, Sajjad A. Madani, Ching-Hsien Hsu
Summary: The objective of this study is to improve the energy efficiency of data center networks while maintaining fault tolerance. The proposed Energy-Aware Fault-Tolerant (EAFT) approach is compared with other energy-efficient resource scheduling techniques. The study shows that there is a tradeoff between energy efficiency and fault tolerance, and that it is possible to achieve a balance between the two.
Article
Mathematical & Computational Biology
Fawza A. Al-Zumia, Yuan Tian, Mznah Al-Rodhaan
Summary: Mobile health networks (MHNWs) provide instant medical health care and remote health monitoring for patients, requiring quick collection, processing, and analysis of a vast amount of health data. The main challenge lies in limited computational and storage resources, leading to the need to outsource health data to the cloud, which raises security and privacy concerns. A novel design for a private and fault-tolerant cloud-based data aggregation scheme is proposed, using differential privacy for privatization and improving fault tolerance capabilities. The scheme is efficient, reliable, and secure, with minimized aggregation error compared to related schemes.
MATHEMATICAL BIOSCIENCES AND ENGINEERING
(2021)
Article
Computer Science, Artificial Intelligence
Xiaodong Qi, Zhao Zhang, Cheqing Jin, Aoying Zhou
Summary: The article introduces a novel storage engine called BFT-Store, which enhances storage scalability by integrating erasure coding with a Byzantine Fault Tolerance (BFT) consensus protocol. BFT-Store reduces the storage consumption per block to O(1) for the first time, increasing overall storage capability, and includes an efficient online re-encoding protocol and a hybrid replication scheme to enhance reading performance. Analysis and experimental results demonstrate the scalability, availability, and efficiency of BFT-Store in Tendermint, an open-source permissioned blockchain.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2021)
Article
Computer Science, Artificial Intelligence
Zhengyi Du, Xiongtao Pang, Haifeng Qian
Summary: This paper proposes a better blockchain storage scheme named PartitionChain that addresses three problems in system scalability, while maintaining the merits of BFT-Store. The scheme reduces storage costs by using aggregate signatures as proof of encoded data and eliminates the need for a trusted third party. It also significantly lowers computational complexity for retrieving data and reduces transmitted data for recovering each block, allowing for dynamic network adaptation and improved efficiency and scalability compared to BFT-Store.
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
(2023)
Article
Computer Science, Information Systems
Jarallah Alqahtani, Hassan H. Sinky, Bechir Hamdaoui
Summary: The multi-tenancy concept in cloud data center networks has paved the way for innovations in infrastructure such as network virtualization. However, traditional IP multicast routing is not suitable for DC networks, leading to the development of state-of-the-art DC multicast routing approaches that aim to optimize packet information for scalability.
Article
Computer Science, Information Systems
Bara Abusalah, Derek Schatzlein, Julian James Stephen, Masoud Saeida Ardekani, Patrick Eugster
Summary: This article proposes the paradigm of dependable resources, which provides generic fault tolerance mechanisms by offering fault tolerance support at the level of resource management systems. Through the demonstration of Guardian, the benefits of this concept are shown, improving completion time for big data processing frameworks in the presence of failures while maintaining low overhead.
IEEE TRANSACTIONS ON CLOUD COMPUTING
(2022)
Article
Computer Science, Information Systems
Priti Kumari, Parmeet Kaur
Summary: Cloud computing has transformed the delivery model of information technology from product to service, but its performance is hindered by scale-related vulnerabilities, making fault tolerance a critical requirement for achieving high performance. This comprehensive overview of fault tolerance in cloud computing presents solutions and identifies future research directions.
JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES
(2021)
Article
Computer Science, Artificial Intelligence
Mani Alaei, Reihaneh Khorsand, Mohammadreza Ramezanpour
Summary: The research aims to develop an adaptive fault detection strategy based on the Improved Differential Evolution algorithm in cloud computing to minimize energy consumption, makespan, total cost, and tolerate faults while scheduling scientific workflows. The proposed method utilizes an adaptive network-based fuzzy inference system prediction model to proactively control resource load fluctuation and applies a reactive fault tolerance technique for processor failures. Experimental results showed significant improvements in scheduling performance, fault tolerance, makespan, energy consumption, task fault ratio, and total cost compared to existing techniques.
APPLIED SOFT COMPUTING
(2021)
Article
Computer Science, Theory & Methods
Junxu Xia, Geyao Cheng, Lailong Luo, Deke Guo, Pin Lv, Bowen Sun
Summary: This article proposes a deduplication-enabled storage system called MEAN that uses unreliable resources at the network edge. MEAN places similar files together for better deduplication and maintains replicas of popular files for higher reliability. The authors formulate the problem, prove its NP-hardness, and provide efficient heuristics based on similarity-aware hierarchical clustering. Performance evaluation using a real-world dataset shows that MEAN improves the file hit ratio by 77% and reduces file retrieval delay by up to 71% compared to existing methods.
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS
(2023)
Article
Computer Science, Theory & Methods
J. Armando Barron-Lugo, J. L. Gonzalez-Compean, Ivan Lopez-Arevalo, Jesus Carretero, Jose L. Martinez-Rodriguez
Summary: This paper presents Xel, a cloud-agnostic data platform designed to support the building of high-availability data science services for data-driven decision-making. Xel includes a framework for end-users to select analytic and machine learning tools, a recursive ETL processing model, an orchestration model for managing data delivery, and a data decentralized model to mask service unavailability. Real users have successfully created various data science services using Xel, and the platform has been proven effective in enabling users to create pipelines without programming and automatically masking unavailability of cloud resources and data.
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
(2023)
Review
Computer Science, Information Systems
Muhammad Asim Shahid, Noman Islam, Muhammad Mansoor Alam, M. S. Mazliham, Shahrulniza Musa
Summary: This research article provides a detailed survey of emerging fault tolerance methods for Cloud Computing, categorizing them into Reactive Methods, Proactive Methods, and Resilient Methods. Each category focuses on different approaches to deal with system faults, with Resilient Methods aiming to reduce recovery time from malfunctions by utilizing Machine Learning and Artificial Intelligence.
COMPUTER SCIENCE REVIEW
(2021)
Review
Computer Science, Hardware & Architecture
Shreshth Tuli, Fatemeh Mirhakimi, Samodha Pallewatta, Syed Zawad, Giuliano Casale, Bahman Javadi, Feng Yan, Rajkumar Buyya, Nicholas R. Jennings
Summary: In recent years, there has been a shift in computing paradigms towards decentralized systems like IoT, Edge, Fog, Cloud, and Serverless. This shift has been powered by the adoption of AI-driven autonomous systems for managing distributed computing resources. This survey explores the evolution of data-driven AI methods and their impact on computing systems, focusing on resource management and QoS optimization. It also discusses future research directions and the potential of AI-driven computing systems.
JOURNAL OF NETWORK AND COMPUTER APPLICATIONS
(2023)
Article
Computer Science, Hardware & Architecture
Biswajeet Sethi, Sourav Kanti Addya, Jay Bhutada, Soumya K. K. Ghosh
Summary: Serverless computing is a new standard for cloud applications, but it neglects the importance of data. Existing serverless architectures are based on data shipping, which leads to high latency. This paper proposes an inter-region code shipping architecture that allows code to flow from the computation side to the data side, achieving lower latency in a serverless environment.
JOURNAL OF SUPERCOMPUTING
(2023)