Journal
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE
Volume 32, Issue -, Pages 324-337
Publisher
ELSEVIER SCIENCE BV
DOI: 10.1016/j.future.2013.08.007
Keywords
Large graph processing; Parallel processing; Big data; Cloud computing; Collective classification; Shortest path; Networked data; Bulk Synchronous Parallel; MapReduce
Funding
- Polish National Center of Science
- Institute of Informatics of Wroclaw University of Technology
- European Commission [316097]
Abstract
More and more large data collections are gathered worldwide in various IT systems. Many of them have a networked nature and need to be processed and analysed as graph structures. Because of their size, they very often require a parallel paradigm for efficient computation. Three parallel techniques are compared in the paper: MapReduce, its map-side join extension, and Bulk Synchronous Parallel (BSP). They are implemented for two different graph problems: calculation of single source shortest paths (SSSP) and collective classification of graph nodes by means of relational influence propagation (RIP). The methods and algorithms are applied to several network datasets differing in size and structural profile, originating from three domains: telecommunication, multimedia and microblog. The results show that iterative graph processing with the BSP implementation always significantly outperforms MapReduce, by up to a factor of 10, especially for algorithms with many iterations and sparse communication. The MapReduce extension based on map-side join is usually more efficient than the original MapReduce, although it does not match BSP. Nevertheless, MapReduce remains a good alternative for enormous networks whose data structures do not fit in local memories. (C) 2013 The Authors. Published by Elsevier B.V. All rights reserved.
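To make the BSP formulation of SSSP mentioned in the abstract more concrete, the following is a minimal single-process sketch of the superstep pattern, not the paper's implementation. It assumes non-negative edge weights and a graph small enough to hold in one dictionary. The function name `bsp_sssp`, the adjacency-list layout, and the `inbox`/`outbox` variables are hypothetical names chosen for illustration. In a real BSP system the vertices would be partitioned across workers and the swap of `outbox` into `inbox` would correspond to the global synchronisation barrier.

```python
import math
from collections import defaultdict


def bsp_sssp(graph, source):
    """Single source shortest paths in a BSP-like, vertex-centric style.

    graph: dict mapping each vertex to a list of (neighbour, weight) pairs.
    Assumes non-negative weights (an assumption of this sketch).
    Returns (distances, number_of_supersteps).
    """
    # Collect every vertex, including those that appear only as neighbours.
    vertices = set(graph)
    for edges in graph.values():
        for neighbour, _ in edges:
            vertices.add(neighbour)

    dist = {v: math.inf for v in vertices}

    # Messages to be delivered at the start of the next superstep.
    inbox = {source: [0.0]}
    supersteps = 0

    while inbox:
        outbox = defaultdict(list)
        # Local computation: each vertex with messages processes them.
        for vertex, messages in inbox.items():
            best = min(messages)
            if best < dist[vertex]:
                dist[vertex] = best
                # Communication: propose new distances to neighbours.
                for neighbour, weight in graph.get(vertex, []):
                    outbox[neighbour].append(best + weight)
        # Barrier: messages become visible only in the next superstep.
        inbox = dict(outbox)
        supersteps += 1

    return dist, supersteps


example_graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("c", 2.0)],
    "c": [],
}
distances, steps = bsp_sssp(example_graph, "a")
print(distances, steps)
```

Only vertices that received a message do any work in a given superstep, and the graph state stays in memory between supersteps. This is one plausible reading of why the abstract reports the largest BSP advantage for algorithms with many iterations and sparse communication, whereas an iterative MapReduce job would typically reread and rewrite the graph on every iteration.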