Article

The Family of MapReduce and Large-Scale Data Processing Systems

Journal

ACM COMPUTING SURVEYS
Volume 46, Issue 1, Pages -

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/2522968.2522979

Keywords

Design; Algorithms; Performance; MapReduce; big data; large-scale data processing

Funding

  1. Australian Government

Over the last two decades, the continuous increase in computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architectures and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling, and fault tolerance. However, the original implementation of the MapReduce framework had some limitations, which many follow-up research efforts have tackled since its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining momentum in both the research and industrial communities. We also cover a set of systems that provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework but target different purposes and application scenarios. Finally, we discuss some future research directions for implementing the next generation of MapReduce-like solutions.
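
As a rough illustration of the programming model the abstract describes, the sketch below runs a word count through a map phase, a grouping ("shuffle") phase, and a reduce phase inside a single Python process. It is not taken from the surveyed article: the names `map_word_count`, `reduce_word_count`, and `run_mapreduce` are hypothetical, and a real framework would distribute these phases across a cluster and handle scheduling and fault tolerance itself.

```python
from collections import defaultdict


def map_word_count(document):
    """Map phase: emit a (word, 1) pair for every word in one input record."""
    for word in document.lower().split():
        yield word, 1


def reduce_word_count(word, counts):
    """Reduce phase: combine all values emitted for one key."""
    return word, sum(counts)


def run_mapreduce(inputs, mapper, reducer):
    """Single-process stand-in for a MapReduce run (illustrative only)."""
    # Shuffle: group every intermediate value under its key.
    groups = defaultdict(list)
    for record in inputs:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce: call the reducer once per distinct key.
    return dict(reducer(key, values) for key, values in groups.items())


documents = ["big data big clusters", "data processing on clusters"]
word_counts = run_mapreduce(documents, map_word_count, reduce_word_count)
```

The application code is limited to the two small functions; everything in `run_mapreduce` stands in for the work a framework would take over, which is the separation of concerns the abstract refers to.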

Recommended

Article Cardiac & Cardiovascular Systems

Prognostic value of exercise capacity among patients with treated depression: The Henry Ford Exercise Testing (FIT) Project

Amjad M. Ahmed, Waqas T. Qureshi, Sherif Sakr, Michael J. Blaha, Clinton A. Brawner, Jonathan K. Ehrman, Steven J. Keteyian, Mouaz H. Al-Mallah

CLINICAL CARDIOLOGY (2018)

Review Peripheral Vascular Disease

Cardiorespiratory Fitness and Cardiovascular Disease Prevention: an Update

Mouaz H. Al-Mallah, Sherif Sakr, Ada Al-Qunaibet

CURRENT ATHEROSCLEROSIS REPORTS (2018)

Article Computer Science, Theory & Methods

A Differentiated Caching Mechanism to Enable Primary Storage Deduplication in Clouds

Huijun Wu, Chen Wang, Yinjin Fu, Sherif Sakr, Kai Lu, Liming Zhu

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (2018)

Article Multidisciplinary Sciences

Using machine learning on cardiorespiratory fitness data for predicting hypertension: The Henry Ford Exercise Testing (FIT) Project

Sherif Sakr, Radwa Elshawi, Amjad Ahmed, Waqas T. Qureshi, Clinton Brawner, Steven Keteyian, Michael J. Blaha, Mouaz H. Al-Mallah

PLOS ONE (2018)

Article Computer Science, Theory & Methods

RDF Data Storage and Query Processing Schemes: A Survey

Marcin Wylot, Manfred Hauswirth, Philippe Cudre-Mauroux, Sherif Sakr

ACM COMPUTING SURVEYS (2018)

Article Cardiac & Cardiovascular Systems

Predictors of in-hospital length of stay among cardiac patients: A machine learning approach

Tahani A. Daghistani, Radwa Elshawi, Sherif Sakr, Amjad M. Ahmed, Abdullah Al-Thwayee, Mouaz H. Al-Mallah

INTERNATIONAL JOURNAL OF CARDIOLOGY (2019)

Article Computer Science, Information Systems

Stream Processing Languages in the Big Data Era

Martin Hirzel, Guillaume Baudart, Angela Bonifati, Emanuele Della Valle, Sherif Sakr, Akrivi Vlachou

SIGMOD RECORD (2018)

Meeting Abstract Cardiac & Cardiovascular Systems

CARDIORESPIRATORY FITNESS AND INCIDENT STROKE TYPES: THE FIT (HENRY FORD EXERCISE TESTING) PROJECT

Mahmoud Sobhi Al Rifai, Amjad Ahmed, Michael Blaha, Fatimah Almasoudi, Sherif Sakr, Waqas Qureshi, Clinton Brawner, Jonathan Ehrman, Steven Keteyian, Mouaz Al-Mallah

JOURNAL OF THE AMERICAN COLLEGE OF CARDIOLOGY (2019)

Editorial Material Computer Science, Theory & Methods

Editorial for Special issue of FGCS special issue on Benchmarking big data systems

Sherif Sakr, Albert Zomaya, Athanasios V. Vasilakos

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE (2019)

Meeting Abstract Cardiac & Cardiovascular Systems

Improvement in Maximal Exercise Capacity is Inversely Related to Incident Ischemic Stroke: Data From the Henry Ford Exercise Testing (FIT) Project

Jonathan K. Ehrman, Steven J. Keteyian, Waqas Qureshi, Sherif Sakr, Michelle C. Johansen, Michael J. Blaha, Mouaz H. Al-Mallah, Clinton A. Brawner

CIRCULATION (2019)

Proceedings Paper Computer Science, Theory & Methods

Calculation of Average Road Speed Based on Car-to-Car Messaging

Ahmed Ramzy, Ahmed Awad, Amr A. Kamel, Osman Hegazy, Sherif Sakr

2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP) (2019)

Article Computer Science, Information Systems

Dagstuhl Seminar on Big Stream Processing

Sherif Sakr, Tilmann Rabl, Martin Hirzel, Paris Carbone, Martin Strohbach

SIGMOD RECORD (2018)

Article Computer Science, Information Systems

Business Process Analytics and Big Data Systems: A Roadmap to Bridge the Gap

Sherif Sakr, Zakaria Maamar, Ahmed Awad, Boualem Benatallah, Wil M. P. Van Der Aalst

IEEE ACCESS (2018)

Article Computer Science, Information Systems

HDM: A Composable Framework for Big Data Processing

Dongyao Wu, Liming Zhu, Qinghua Lu, Sherif Sakr

IEEE TRANSACTIONS ON BIG DATA (2018)

Review Computer Science, Artificial Intelligence

Big Data Systems Meet Machine Learning Challenges: Towards Big Data Science as a Service

Radwa Elshawi, Sherif Sakr, Domenico Talia, Paolo Trunfio

BIG DATA RESEARCH (2018)