HPC Cloud Updates WE 24 Aug 2025
Updates to AWS, Azure & GCP in the last week relevant for HPC practitioners. This week: a new Intel VM family on AWS, and how to check your GPU VMs actually work before doing any real work with them.

AWS
If you’re running HPC jobs on AWS Batch, this change to how the default instances for your jobs are selected may be handy.

We finally have some new Intel-based VMs from AWS: r8i and r8i-flex are now generally available.

New Instances: r7g in Cape Town; i7i in Frankfurt, London, Malaysia, Sydney and Tokyo
Azure
This is aimed at running AI workloads, but the idea should translate to pretty much any kind of GPU workload: either swap in a benchmark that is more appropriate for your code, or assume that the Llama workload will surface any problems that would also affect your own workload.
What I find rather telling, though, is the tacit recognition of the level of GPU failures being seen. Whilst this is no doubt useful for end users, shouldn’t companies renting out GPUs be making sure they’re fit for purpose themselves before charging people for them? This isn’t a slight on Azure specifically but on everyone renting out GPUs in the cloud.
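If you want something much cheaper to run before a full benchmark, a basic pre-flight check is easy to sketch. The following is a minimal Python example, not the approach from the Azure post: it assumes NVIDIA GPUs with `nvidia-smi` on the path, and the function names and thresholds are made up for illustration. It only confirms that the expected number of GPUs is visible and that each reports the memory you are paying for, so it complements a real workload test rather than replacing one.

```python
import csv
import io
import subprocess

# Standard nvidia-smi query properties: GPU index, product name, total memory.
QUERY_FIELDS = ["index", "name", "memory.total"]


def query_gpus():
    """Run nvidia-smi and return its CSV output (needs NVIDIA drivers installed)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_gpus(csv_text):
    """Turn nvidia-smi CSV output into a list of dicts, one per GPU."""
    gpus = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        index, name, memory_mib = (field.strip() for field in row)
        gpus.append(
            {"index": int(index), "name": name, "memory_mib": int(memory_mib)}
        )
    return gpus


def find_problems(gpus, expected_count, min_memory_mib):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    if len(gpus) != expected_count:
        problems.append(f"expected {expected_count} GPUs, found {len(gpus)}")
    for gpu in gpus:
        if gpu["memory_mib"] < min_memory_mib:
            problems.append(
                f"GPU {gpu['index']} ({gpu['name']}) reports "
                f"{gpu['memory_mib']} MiB, below {min_memory_mib} MiB"
            )
    return problems
```

At the start of a job script you could call `find_problems(parse_gpus(query_gpus()), expected_count=8, min_memory_mib=80000)` and exit early if the list is not empty. The values 8 and 80000 are placeholders; set them to match the VM size you actually requested.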
Azure NetApp Files now has access logs
v2 billing model for Azure Files SSD premium now GA
Need confidential VMs? Private preview of DCesv6 and ECesv6 now available
Google Cloud
Not HPC, but still interesting from a large-scale compute cost perspective
