HPC = High Performance Containers?
What place does containerisation have in the world of HPC?
Not too many years ago, I would have adamantly claimed that containerisation is the future. Containerise all the things! Including HPC. HPC = High Performance Containers.
Now? I’m not so sure.
But first let’s dispel a myth and break things down a little. First, the myth: running an application inside a container is not inherently slower than running it directly on the machine. A containerised process runs on the host kernel; namespaces and cgroups provide isolation, not a layer of virtualisation, so there is no inherent runtime penalty.
Secondly, I still think containerisation as a way to package software, even for HPC, is valuable. What I’m less sure about is its use as a unit of either scheduling or orchestration.
Containers as a unit of orchestration make huge amounts of sense when running large numbers of diverse applications, sometimes with fairly small compute requirements. Microservice deployments, for example. It’s not a coincidence that Kubernetes allows compute to be requested in milliCPUs per container.
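To make that concrete, here’s a minimal sketch of a Kubernetes pod spec requesting a fraction of a CPU; the pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tiny-service                  # hypothetical microservice
spec:
  containers:
    - name: app
      image: example.com/app:latest   # placeholder image
      resources:
        requests:
          cpu: "250m"                 # 250 milliCPUs: a quarter of one core
          memory: "64Mi"
        limits:
          cpu: "500m"                 # never more than half a core
```

At that granularity, packing many small containers onto shared machines is exactly what an orchestrator is for.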
When was the last time an HPC job needed milliCPUs, though? HPC applications need to orchestrate capacity in the hundreds and thousands of CPUs. Why bother breaking that down into fractions of a CPU? Why not just deal in either machines or CPUs directly?
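Compare that with how an HPC scheduler already talks about capacity. Here’s a sketch of a typical Slurm batch script (the job name, node counts, and solver binary are all made up):

```bash
#!/bin/bash
#SBATCH --job-name=cfd-solver      # hypothetical job name
#SBATCH --nodes=16                 # whole machines, not fractions of one
#SBATCH --ntasks-per-node=128      # one MPI rank per physical core
#SBATCH --time=12:00:00            # wall-clock limit

# Launch the MPI application across all 2048 allocated cores.
srun ./solver input.dat
```

There is no fractional unit here because nothing about the workload calls for one.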
As for containers as a unit of scheduling, I think it’s perhaps just too restrictive. The reality is that most HPC applications are rather old. Expecting them to adopt containerisation just to use a new scheduler raises yet another hurdle, and one I’m not sure will really benefit the application either.
Am I wrong? Are containers still the panacea they were poised to be ten years ago? Or just a distraction in the world of HPC?