Designing a Hadoop MapReduce Performance Model using Micro Benchmarking Approach

Proceedings of The 2nd International Conference on Innovation in Computer Science and Artificial Intelligence

Year: 2019

Manal Alalawi and Herbert Daly

 

ABSTRACT: 

Hadoop MapReduce platforms are now widely used for complex analysis of large data sets. In MapReduce environments, the parallel and distributed processing of big data has high energy requirements. Many organisations are looking to reduce the energy requirements of Hadoop MapReduce while maintaining the same performance levels. This paper proposes a platform performance model specifically for Hadoop MapReduce environments, with the aim of improving the energy efficiency of these applications automatically. Unlike existing performance models, the proposed model relates the amount of data processed to the durations of the executed phases, using measurements collected from executed sets of micro benchmarks. The model's resource distribution strategy helps to estimate job completion time on the basis of how resources are distributed. Mathematical modelling and experiments demonstrate the accuracy of this performance model in improving the energy efficiency of Hadoop MapReduce environments.

Keywords: Hadoop, MapReduce, energy efficiency, performance model, MapReduce phases.