Cloud HPC: Do you still need a scheduler?
In an era of “infinite” compute capacity, does the world of high-performance computing still need a scheduler?
Back in the bad old days. When HPC ran on physical hardware. In your own building. When your code was written in C (the programming language, not the vitamin, for the kids out there). When we had sign-up sheets to book time on our favourite SGI workstation. Back then we needed a scheduler for our HPC workloads.
I guess those sign-up sheets, and having to schedule and share capacity on mainframes, were still fresh enough in everyone’s minds that when it came to sharing an HPC cluster, creating a scheduler was the natural solution. This was especially true if multiple users or applications were involved, or workloads of varying priority.
Then came the internet. Planet-scale computing. No one was about to queue to shop in an online storefront, so… We don’t schedule. We scale! And… cue the cloud. Cue pay-per-hour compute. Cue “infinite” capacity.
Who needs a scheduler, right? Maybe.
HPC practitioners seem to be split into two camps on this: those who think a scheduler is no longer required in the cloud era (yes, they tend to work for cloud providers) and those who believe scheduling is still necessary. Even on the cloud.
So! Does HPC in 2024 still require a scheduler? Discuss! Thoughts and opinions in the comments below.