Date of this Version
The growing cost of power in data centers relative to operation costs has been a concern for the past several decades. It has been predicted that, without intervention, the energy cost will soon outgrow the infrastructure and operation cost. It is therefore of great importance to make data center clusters more energy efficient, which is also critical for avoiding system overheating and failures. In addition, energy inefficiency causes not only loss of capital but also environmental pollution. Various Power Management (PM) strategies have been developed over the years to make systems more energy efficient and to counteract the sharply rising cost of electricity. However, it remains a challenge to make a system both power efficient and computation efficient because of the many underlying system constraints.
In this thesis, we investigate Power Management techniques for heterogeneous MapReduce clusters that also maintain the required system QoS (Quality of Service). For a cluster that supports MapReduce jobs, it is necessary to develop a PM technique that also considers data availability. We develop our PM strategy by exploiting the fact that the servers in the system are underutilized most of the time. Hence, we first develop a model of our testbed and study how server utilization levels affect power consumption and system throughput. With the established models, we formulate and solve the power optimization problem for heterogeneous MapReduce clusters, in which we control the server utilization levels intelligently to minimize the total power consumption.
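To make the shape of such a problem concrete, the following is a minimal sketch under assumptions that do not come from the thesis: every server stays powered on (so its data remains available), power grows linearly from an idle to a peak wattage with utilization, and throughput is proportional to utilization. Under those assumptions, minimizing total power for a given throughput demand reduces to loading servers in order of their marginal watts per unit of throughput. The `Server` class, the function names, and all numbers are hypothetical and only illustrative; the thesis's actual models and solution method may differ.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # Hypothetical linear model, not taken from the thesis.
    name: str
    idle_watts: float      # power drawn at 0% utilization
    peak_watts: float      # power drawn at 100% utilization
    max_throughput: float  # work units per second at 100% utilization


def assign_utilization(servers, demand):
    """Choose per-server utilization in [0, 1] that meets `demand`
    with minimal total power, assuming all servers stay on."""
    if demand > sum(s.max_throughput for s in servers):
        raise ValueError("demand exceeds total cluster capacity")
    # Marginal cost: extra watts per extra unit of throughput.
    order = sorted(
        servers,
        key=lambda s: (s.peak_watts - s.idle_watts) / s.max_throughput,
    )
    utilization = {s.name: 0.0 for s in servers}
    remaining = demand
    for s in order:
        if remaining <= 0:
            break
        served = min(remaining, s.max_throughput)
        utilization[s.name] = served / s.max_throughput
        remaining -= served
    return utilization


def total_power(servers, utilization):
    """Total cluster power in watts under the linear model."""
    return sum(
        s.idle_watts + (s.peak_watts - s.idle_watts) * utilization[s.name]
        for s in servers
    )


if __name__ == "__main__":
    cluster = [
        Server("efficient", idle_watts=100, peak_watts=200, max_throughput=100),
        Server("inefficient", idle_watts=150, peak_watts=350, max_throughput=100),
    ]
    plan = assign_utilization(cluster, demand=120)
    print(plan, total_power(cluster, plan))
```

In this toy cluster the more efficient server is loaded fully before the less efficient one receives any work. A nonlinear power or throughput model, or a QoS constraint on response time, would call for a general optimization solver instead of this greedy ordering.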
We have conducted simulations that show the power savings achieved with our PM technique, and we validate some of the simulation results by running experiments on a real testbed. Our simulation and experimental data show that our PM strategy works well for heterogeneous MapReduce clusters that consist of a mix of power-efficient and power-inefficient servers.
Adviser: Ying Lu