Quantum #89
Issue #89 of the weekly HPC newsletter by HMx Labs. Some HPC updates from AWS re:Invent, agentic AI for HPC and tackling numerical stability to reduce HPC costs.
Whilst most of the HPC-specific announcements from AWS came a couple of weeks ago during SC25, we did have some interesting things released at re:Invent last week too. AWS have cooked up another VM based on AMD’s 5th generation EPYC CPUs, but this one runs at a 5 GHz clock speed. I’m quite keen to see how this compares to the regular M8a family. I think that might make for quite an interesting data point to add to our paper looking at 4th and 5th generation EPYC processors.
The bigger CPU news though was the release of Graviton5 as a 192-core socket, twice as many cores as in the Graviton4. I wonder if we’ll see an HPC-specific VM with Graviton this time. An hpc9g, if you will. A full roundup of what else was released is in the HPC Cloud Release Notes linked below.

By happy coincidence, not only did Microsoft release a blog post about automating HPC with Copilot agents, but I also came across a paper in which researchers at a national lab have done some work in this area, using Gemini instead. Note that this is still a work in progress.
Both are use cases that I hadn’t considered and, quite honestly, I would have just put them down to another AI hype pump if it wasn’t for the fact that the second of these is written by some very credible people I know personally. So maybe we don’t need to make HPC easy for everyone. It’s fairly clear we’ve been failing at that for long enough. Maybe we just need to point an AI at the complicated stuff and let it deal with it for us.
In The News
AWS re:Invent and all the other updates this week that are HPC and cloud related.

And for a little more detail around that new Graviton5 CPU, Next Platform has a pretty good write-up. We’re waiting on our preview access to be approved (or not), but given the usual NDA requirements around preview access we won’t be able to share much more till it becomes GA anyway 😦

Glenn Lockwood’s SC25 recap is available now and, as in previous years, is well worth a read.

From HMx Labs
Both shorts this week are based on the talk I gave last month at Quant Minds on reducing costs for CPU compute workloads by having code that is portable across CPU manufacturers and architectures… while avoiding the ghost of numerical instability.
Numerical stability and a very brief introduction to where to find it

and the benefits of doing so
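
For anyone who wants a feel for the kind of instability in question before watching, here is a minimal Python sketch. It is not taken from the talk. It assumes one common source of cross-architecture differences: floating-point addition is not associative, so a reduction split across a different number of partial accumulators (as can happen with different vector widths) may round differently. The function name `chunked_sum` and the input values are hypothetical and chosen purely for illustration.

```python
import math


def chunked_sum(values, lanes):
    """Sum `values` using `lanes` independent partial accumulators.

    This loosely mimics a vectorised reduction: element i goes to
    accumulator i % lanes, and the partial sums are combined at the end.
    It is an illustration only, not a model of any specific CPU.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total


# Hypothetical input: large values that cancel, plus small values that
# get lost when they are added to a large running total.
values = [1e16, 1.0, -1e16, 1.0]

one_lane = chunked_sum(values, 1)   # plain left-to-right summation
two_lanes = chunked_sum(values, 2)  # two partial sums, then combined
reference = math.fsum(values)       # correctly rounded sum from the stdlib

print(one_lane, two_lanes, reference)
```

The same four numbers give different totals depending only on how the additions are grouped, which is why results can shift when identical source code is compiled for, or run on, a different CPU.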

Know someone else who might like to read this newsletter? Forward this on to them or even better, ask them to sign up here: https://cloudhpc.news


