Optimising Financial Services HPC Workloads on AMD EPYC 4th and 5th Generation Processors
We take a look at optimising financial risk analytics on modern AMD CPUs. Turns out we had quite a lot to say. About 5500 words in fact!

Running a financial risk system at scale? You probably want to read this.
I’ve been talking about this and dropping little hints about some of our findings for months, but we’re finally ready to release the whole paper.
A massive thank you to everyone who reviewed, provided feedback or otherwise contributed to this work. I’d mention you all by name but then I’d probably get in trouble!
I know what the very next question will be and yes, we are attempting to repeat this exercise for other CPU manufacturers for both x86 and other ISAs. Watch this space.
The abstract is below and you can access the whole paper as a PDF here:
Optimising Financial Services HPC Workloads on AMD EPYC 4th and 5th Generation Processors
Abstract
This paper provides an analysis of HPC workloads in financial services on AMD 4th generation (previously code-named Genoa) and 5th generation (previously code-named Turin) CPUs using the COREx benchmark. It covers:
· A performance comparison of enabling or disabling SMT
· Power utilisation comparisons across generations and SMT states
· COREx results across different generations of CPU and bare metal vs cloud
· Performance optimisation based on process to logical CPU ratios (slot counts)
· Optimisation for interactive (real time) versus end of day risk
Moving from Zen 4 to Zen 5 shows a per-core performance increase of around 22 to 25% for financial analytics workloads. Whilst some of this may be attributable to an increase in the all-core boosted clock speed between the CPUs tested, much of it will be due to other generational improvements.
Further, a 63% increase in performance is observed between the two parts tested. Only part of this (33%) may be attributed to the increased core count between the parts (which are not direct replacements for one another between generations). The remainder is due to improvements in the L3 cache, memory bandwidth and other process improvements. These factors become increasingly important as core density increases and are well illustrated by the results of this work.
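As a rough sanity check on the 63% and 33% figures above, the arithmetic can be sketched as below. This is only an illustration under two assumptions that are ours, not the paper's: that the 33% is the relative increase in core count, and that throughput scales as core count multiplied by per-core performance. The function name `per_core_uplift` is hypothetical.

```python
def per_core_uplift(total_gain: float, core_count_gain: float) -> float:
    """Residual per-core gain once the core-count gain is factored out.

    Arguments and result are fractional gains (0.63 means +63%).
    Assumption: throughput = cores x per-core performance, so the gains
    compose multiplicatively rather than additively.
    """
    return (1.0 + total_gain) / (1.0 + core_count_gain) - 1.0


# Figures quoted in the abstract: +63% overall, +33% from core count.
residual = per_core_uplift(0.63, 0.33)
print(f"Residual per-core uplift: {residual:.1%}")
```

Under those assumptions the residual comes out at a little under 23%, which sits inside the 22 to 25% per-core range quoted in the abstract.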
The results also illustrate outcomes that may be counterintuitive to some, with specific cloud virtual machines exhibiting higher COREx scores than comparable bare metal systems using the same Zen 4 or Zen 5 cores. This was observed on both Azure (HBv4 and Fasv6) and Google (C3D, C4D and H4D) virtual machines.
Finally, this work provides insight into performance and power optimisation of financial services risk workloads on modern CPU architectures. Some results shown here, while consistent with previous benchmarking exercises using COREx, may be counterintuitive to many. These include enabling SMT for the best throughput performance but, conversely, deliberately underutilising the CPU (and even disabling SMT) for interactive risk use cases.
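To make the idea of a slot count (the ratio of processes to logical CPUs) concrete, here is a minimal sketch of how a scheduler might size its worker pool. The function name `worker_count` and the ratios 1.0 and 0.5 are hypothetical placeholders chosen for illustration; they are not values or recommendations taken from the paper.

```python
import os


def worker_count(logical_cpus: int, slot_ratio: float) -> int:
    """Number of worker processes for a process-to-logical-CPU ratio.

    slot_ratio = 1.0 runs one process per logical CPU; a value below 1.0
    deliberately leaves some logical CPUs idle.
    """
    if logical_cpus < 1 or slot_ratio <= 0:
        raise ValueError("logical_cpus must be >= 1 and slot_ratio > 0")
    # Truncate towards zero, but always run at least one worker.
    return max(1, int(logical_cpus * slot_ratio))


# os.cpu_count() reports logical CPUs (it may return None, hence the fallback).
logical = os.cpu_count() or 1

# Hypothetical ratios: fully subscribed for end of day throughput,
# half subscribed for interactive risk.
print("end of day workers:", worker_count(logical, 1.0))
print("interactive workers:", worker_count(logical, 0.5))
```

The right ratio for a given workload is an empirical question, which is the kind of measurement the paper reports.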