SMT/HT in HPC

Should HPC workloads run with Simultaneous Multi-Threading/Hyperthreading enabled or not? Fight!

Whether Simultaneous Multi-Threading (SMT)/Hyperthreading (HT) should be enabled for HPC workloads is always a contentious topic, and a good way to start a conversation (if not a brawl) at any event with a concentration of HPC folk.

Recently, I was mildly surprised to learn that MareNostrum 5, the supercomputer at the Barcelona Supercomputing Center, runs with Hyperthreading enabled. It always seemed taboo to even suggest the idea in classical HPC circles.

"My code is so optimised there’s no way it could benefit from enabling SMT/HT" is the usual retort from proud RSEs, quants and risk system developers.

About two years ago, when we tested several hundred cloud VMs using COREx and Passmark, we found that claim didn’t hold up:

Hyperthreading in HPC: On or Off?
This article discusses the use of simultaneous multi-threading (hyper-threading) in HPC, with a particular emphasis on financial risk systems and the COREx benchmark. It looks into the SMT status of cloud virtual machines, the ability to enable or disable it, and the cost implications of doing so.

Much like our experience with real risk systems, performance with SMT/HT enabled was generally better than with it disabled.

Our recent testing with AMD’s latest CPUs only served to confirm this.

And this is with a workload that will happily pin CPU usage to 100% utilisation and fry your coolers.

Not only that, but you’ll also save a few polar bears and cents by reducing your electricity bill. Sure, we found that enabling SMT increased the power demand, but it also reduced the total energy used for the same amount of work.
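Higher power but lower energy sounds contradictory until you remember that energy is power multiplied by time. A minimal sketch of the arithmetic follows; the wattages and runtimes are made-up illustrative numbers, not figures from our testing.

```python
def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy used by a run: average power draw multiplied by wall-clock time."""
    return avg_power_watts * runtime_seconds


# Hypothetical numbers for the same fixed batch of work (NOT measured results).
smt_off_energy = energy_joules(avg_power_watts=400.0, runtime_seconds=1000.0)
smt_on_energy = energy_joules(avg_power_watts=440.0, runtime_seconds=800.0)

# Fraction of energy saved by enabling SMT in this made-up scenario.
energy_saving = 1.0 - smt_on_energy / smt_off_energy

print(f"SMT off: {smt_off_energy / 1000:.0f} kJ")
print(f"SMT on:  {smt_on_energy / 1000:.0f} kJ")
print(f"Saving:  {energy_saving:.0%}")
```

In this sketch power rises by 10% while the batch finishes 20% sooner, so total energy falls. Energy only drops when the runtime shrinks by more than the power grows, which is exactly what you have to measure on your own workload.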

So, it’s a slam dunk, right? Just enable SMT everywhere and move on? Not so fast.

The same work that produced the above also showed that enabling SMT/HT increased the time to result. Sure, it increases the total throughput too, but if what you care about more is how long it takes to get an answer, then leaving SMT/HT disabled will serve you better.
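One way to see how throughput and time to result can move in opposite directions is a toy model in which each hardware thread runs one task at a time. The core count and task times below are hypothetical, chosen only to illustrate the trade-off, and the model assumes an endless queue of independent tasks.

```python
def throughput_tasks_per_second(concurrent_tasks: int, seconds_per_task: float) -> float:
    """Steady-state throughput when every hardware thread runs one task at a time."""
    return concurrent_tasks / seconds_per_task


PHYSICAL_CORES = 64  # hypothetical machine

# Hypothetical per-task times: two threads sharing a core each run slower
# than a single thread that has the core to itself.
smt_off_task_seconds = 10.0
smt_on_task_seconds = 16.0

smt_off_throughput = throughput_tasks_per_second(PHYSICAL_CORES, smt_off_task_seconds)
smt_on_throughput = throughput_tasks_per_second(2 * PHYSICAL_CORES, smt_on_task_seconds)

print(f"SMT off: {smt_off_task_seconds:.0f} s per task, {smt_off_throughput:.1f} tasks/s")
print(f"SMT on:  {smt_on_task_seconds:.0f} s per task, {smt_on_throughput:.1f} tasks/s")
```

With these made-up numbers, SMT delivers more tasks per second across the machine while every individual task takes longer to come back. Which of those two numbers matters is a question about your workload, not about the hardware.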

Same workload, same hardware. Different answer.

And we haven't even considered software licensing implications yet.

There is only one way to know if you should be running with SMT/HT enabled or not. Ask me. 

I jest, of course. You need to test it, and you need to know what outcome you’re looking for.
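Before testing anything, it helps to confirm which state you are actually in. On reasonably recent Linux kernels the file /sys/devices/system/cpu/smt/active reports 1 when SMT is active and 0 when it is not. The sketch below assumes Linux; the helper name is my own, and what a cloud VM reports depends on what the hypervisor chooses to expose.

```python
from pathlib import Path
from typing import Optional

# Linux sysfs file that reports whether SMT is currently active ("1") or not ("0").
SMT_ACTIVE_PATH = Path("/sys/devices/system/cpu/smt/active")


def smt_is_active(path: Path = SMT_ACTIVE_PATH) -> Optional[bool]:
    """Return True/False for SMT on/off, or None if the state cannot be read."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        # File missing or unreadable: not Linux, an older kernel, or a restricted environment.
        return None


if __name__ == "__main__":
    labels = {True: "enabled", False: "disabled", None: "unknown"}
    print(f"SMT is {labels[smt_is_active()]}")
```

Record this alongside every benchmark result, so that a later comparison of "on" versus "off" runs is not undermined by a machine that was never in the state you thought it was.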

The whole paper this extract is based on can be found here:

Optimising Financial Services HPC Workloads on AMD EPYC 4th and 5th Generation Processors
We take a look at optimising financial risk analytics on modern AMD CPUs. Turns out we had quite a lot to say. About 5500 words in fact!