CIS Seminar Series
Advanced Schedulers for Next-Generation HPC Systems - Stephen Herbein, PhD Student, CIS Department
High performance computing (HPC) is undergoing many changes at both the system and workload levels. At the system level, data movement is becoming more costly relative to computation, and HPC centers are becoming increasingly power-constrained. To adapt to these trends, HPC systems are including new resources such as burst buffers and GPUs, which makes the resource set larger and more diverse. At the workload level, new ensemble workloads, such as uncertainty quantification (UQ), are emerging within HPC, driving up the workload scale in terms of the number of jobs. Existing HPC scheduling models are unable to adapt to these changes, leading to degraded system efficiency and application performance.
In this thesis, we claim that new schedulers are needed to overcome the above-mentioned challenges and efficiently manage the next generation of HPC systems. To this end, we design, implement, and evaluate three key transformations to the existing scheduling models. First, we integrate I/O-awareness into existing scheduling policies and demonstrate that I/O-aware scheduling increases the efficiency of burst buffer-enabled HPC systems. Second, we expand our I/O-aware scheduler to incorporate accurate knowledge of application I/O utilization patterns provided by machine learning models. Third, we design a prototype scheduler based on the fully hierarchical scheduling model and show that it reduces scheduler overhead and increases job throughput on synthetic and real-world ensemble workloads, such as UQ. Our work is the first step towards a new generation of scheduling models for HPC.
Wednesday, May 16, 2018 at 2:00pm
Smith Hall, Room 102A
Smith Hall, University of Delaware, Newark, DE 19716, USA