Computing, School of

First Advisor

Ying Lu

Date of this Version

7-24-2011

Document Type

Dissertation

Citation

A dissertation presented to the faculty of the Graduate College at the University of Nebraska in partial fulfillment of requirements for the degree of Doctor of Philosophy

Major: Computer Science

Under the supervision of Professor Ying Lu

Lincoln, Nebraska, July 2011

Abstract

Cluster computing has become an important paradigm for solving large-scale problems. However, as the size of a cluster increases, so does the complexity of resource management and maintenance. Therefore, automated performance control and resource management are expected to play critical roles in sustaining the evolution of cluster computing. Current cluster scheduling practice is similar in sophistication to early supercomputer batch scheduling algorithms, and gives no consideration to desired quality-of-service (QoS) attributes. To fully exploit the power of computational clusters, new scheduling algorithms that provide high performance, QoS assurance, fault tolerance, energy savings, and streamlined management of cluster resources need to be developed.

The challenge in developing real-time scheduling algorithms for cluster and grid computing, however, is to support various types of applications. Broadly speaking, computational loads submitted to a cluster can be categorized into three types: sequential, modularly divisible, and arbitrarily divisible. An arbitrarily divisible workload model is a good approximation of many real-world applications, e.g., distributed search for a pattern in text, audio, graphical, and database files; distributed processing of big measurement data files; and many simulation problems. All elements in such an application often demand an identical type of processing, and relative to the huge total computation, the processing on each individual element is infinitesimally small. Such applications are becoming a major type of cluster workload, and thus providing QoS to arbitrarily divisible loads is becoming a significant problem for cluster-based research computing facilities.

The problem of providing performance guarantees to divisible load applications has not been studied systematically. The objective of this dissertation is to provide assured QoS performance to cluster and grid applications through the development of new real-time scheduling theory and algorithms, in particular real-time divisible load scheduling algorithms for cluster computing. We develop and apply real-time scheduling algorithms for cluster computing, providing QoS for grid and High Performance Computing (HPC) applications. In this dissertation, we address the aforementioned challenges by investigating and developing 1) real-time scheduling algorithms for divisible loads, 2) a real-time scheduling algorithm for divisible loads with advance resource reservation, 3) an efficient real-time divisible load scheduling algorithm for large clusters, and 4) feedback-control-based real-time divisible load scheduling algorithms that provide predictable performance in unpredictable environments.

Advisor: Ying Lu

Included in

Computer Engineering Commons, Computer Sciences Commons

School of Computing: Dissertations, Theses, and Student Research

Real-time Divisible Load Scheduling for Cluster Computing
