Computer Science and Engineering, Department of


Date of this Version



2013 IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing. Pages: 1536 - 1544, DOI: 10.1109/HPCC.and.EUC.2013.216.


U.S. government work.


MapReduce has been widely used as a Big Data processing platform. As it gets popular, its scheduling becomes increasingly important. In particular, since many MapReduce applications require real-time data processing, scheduling realtime applications in MapReduce environments has become a significant problem. In this paper, we create a novel real-time scheduler for MapReduce, which overcomes the deficiencies of an existing scheduler. It avoids accepting jobs that will lead to deadline misses and improves the cluster utilization. We implement our scheduler in Hadoop system and experimental results show that our scheduler provides deadline guarantees for accepted jobs and achieves good cluster utilization.