Scheduling in large-scale computing clusters is critical to job performance and resource utilization. As cluster sizes grow to thousands of machines and scheduling needs become complex and varied, scheduling in cloud-scale clusters presents unique challenges. To encourage the development of innovative schedulers, there is a need for an experimental framework that can analyze scheduling performance over large clusters using relatively modest resources. In this thesis, we present an experimental scheduler testbed to study job scheduling in emulated cloud-scale clusters. We show that the performance of a scheduler in an emulated cluster closely models its performance in a real cluster of the same size.

We use the testbed to evaluate the monolithic scheduler architecture, a popular scheduling architecture, in a 6000-node emulated cluster under a realistic workload. We conclude that scheduling algorithms should embrace randomness in order to overcome resource contention, and we infer that scheduling in the monolithic architecture is a network-I/O-intensive process. We also calculate the optimal values of the monolithic architecture's design parameters for a Google workload.

Hadoop YARN is a popular open-source cluster management framework that can be seen as an implementation of the monolithic scheduler architecture. We evaluate the three default scheduling policies in Hadoop YARN (Capacity, Fair, and FIFO) under a realistic workload. Based on our experiments, we observe that FIFO scheduling results in unbalanced load across cluster machines and is not suitable for enterprise clusters. We also study the trade-offs exploited by the Capacity and Fair schedulers: while the Fair scheduler offers lower scheduling delay by avoiding the head-of-line blocking problem, it may drop applications when the load increases; the Capacity scheduler, on the other hand, does not drop any application but errs on the side of higher scheduling delay.
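For reference, the YARN scheduling policy compared above is selected through the ResourceManager configuration in `yarn-site.xml`; the following fragment shows the standard `yarn.resourcemanager.scheduler.class` property (the commented-out values are the stock Fair and FIFO scheduler classes shipped with Hadoop):

```xml
<!-- yarn-site.xml: choose the ResourceManager scheduling policy -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <!-- Capacity scheduler (the default in stock Hadoop) -->
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  <!-- Alternatives evaluated in this thesis:
       org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
       org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler -->
</property>
```

Queue-level parameters for the Capacity and Fair schedulers are configured separately, in `capacity-scheduler.xml` and `fair-scheduler.xml` respectively.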
Thesis title: An experimental study of monolithic scheduler architecture in cloud computing systems