Article

An optimized MapReduce workflow scheduling algorithm for heterogeneous computing

Journal

JOURNAL OF SUPERCOMPUTING
Volume 72, Issue 6, Pages 2059-2079

Publisher

SPRINGER
DOI: 10.1007/s11227-014-1335-2

Keywords

Hadoop; Heterogeneous cluster; MapReduce; Scheduling; Workflow

Funding

  1. Key Program of National Natural Science Foundation of China [61133005, 61432005]
  2. National Natural Science Foundation of China [61103047, 61370095]

The MapReduce framework is considered an effective solution for large-scale parallel data processing. This paper models a massive data processing workflow as a directed acyclic graph (DAG) of MapReduce jobs. In a heterogeneous computing environment, the computation speed of a given slot can differ from job to job. To address this problem, the paper proposes an optimized MapReduce workflow scheduling algorithm that comprises a job prioritizing phase and a task assignment phase. First, jobs are classified as I/O-intensive or compute-intensive, and the priority of each job is computed according to its type. Then, suitable slots are allocated for each block, and the MapReduce tasks in the workflow are scheduled with respect to data locality. The experimental results show that the algorithm improves task scheduling performance and yields a more reasonable allocation of resources in heterogeneous computing environments.
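The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the priority formula or the slot-selection rule, so the sketch substitutes a HEFT-style upward rank for the prioritizing phase and a greedy earliest-finish-time choice with a fixed penalty for non-local data in the assignment phase. All names (Job, Slot, block_node, remote_penalty) and the per-type speed table are hypothetical, and each job is simplified to a single task reading a single block.

```python
from dataclasses import dataclass, field

@dataclass
class Job:
    name: str
    kind: str                 # "io" or "cpu" (hypothetical job-type labels)
    work: float               # abstract work units
    succs: list = field(default_factory=list)  # names of successor jobs in the DAG

@dataclass
class Slot:
    name: str
    node: str                 # node that hosts this slot
    speed: dict               # per-type speed, e.g. {"io": 2.0, "cpu": 1.0}
    free_at: float = 0.0      # time at which the slot becomes idle

def mean_cost(job, slots):
    """Average run time of a job over all slots, using the speed for its type."""
    return sum(job.work / s.speed[job.kind] for s in slots) / len(slots)

def priorities(jobs, slots):
    """Phase 1 (assumed rule): upward rank = own mean cost + largest successor rank."""
    rank = {}
    def visit(name):
        if name not in rank:
            job = jobs[name]
            rank[name] = mean_cost(job, slots) + max(
                (visit(s) for s in job.succs), default=0.0)
        return rank[name]
    for name in jobs:
        visit(name)
    return rank

def schedule(jobs, slots, block_node, remote_penalty=1.0):
    """Phase 2 (assumed rule): greedy earliest finish time, penalizing remote reads."""
    rank = priorities(jobs, slots)
    preds = {n: [] for n in jobs}
    for j in jobs.values():
        for s in j.succs:
            preds[s].append(j.name)
    finish, plan = {}, []
    # With positive costs, descending rank visits every job after its predecessors.
    for name in sorted(jobs, key=lambda n: -rank[n]):
        job = jobs[name]
        ready = max((finish[p] for p in preds[name]), default=0.0)
        def eft(slot):
            penalty = 0.0 if slot.node == block_node[name] else remote_penalty
            return max(ready, slot.free_at) + job.work / slot.speed[job.kind] + penalty
        best = min(slots, key=eft)
        finish[name] = eft(best)
        best.free_at = finish[name]
        plan.append((name, best.name, finish[name]))
    return plan
```

Calling schedule(jobs, slots, block_node) with a dict of Job objects keyed by name, a list of Slot objects, and a dict mapping each job name to the node that holds its input block returns a list of (job, slot, finish time) tuples in scheduling order. Keeping a separate speed per job type on each slot is one simple way to model the observation that the same slot runs different kinds of jobs at different speeds.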
