To Cache or Not To Cache
A look at the role of in-memory caches in HPC
Technology has a wonderfully circular nature sometimes.
Around 10 or so years ago, working on financial risk systems, I was busy implementing in-memory caches to feed the HPC workers. Most risk systems at the time relied on either a basic file share or a database for this, and moving to a cache was a huge jump in performance.
More recently, when working with clients to migrate their HPC workloads to the cloud, we’ve come across risk systems that hadn’t received the same level of investment over the last decade. They still ran using file systems as the input source for workers. We didn’t add an in-memory cache. Why bother? There was more than one filesystem option available that exceeded the demands of the HPC application. The advent of both better hardware (SSDs) and faster distributed filesystems meant it just wasn’t necessary.
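To make the contrast concrete, here is a minimal sketch of the two worker input patterns being compared. The paths, key names, and host are hypothetical, and the cache example assumes the redis-py client; treat it as an illustration of the trade-off, not anyone's actual implementation.

```python
import json
import redis  # assumes the redis-py client; works against Redis or Valkey

# Pattern 1: the worker reads its input straight from a shared/distributed
# filesystem. With SSD-backed, parallel filesystems this is often fast enough.
def load_task_from_filesystem(task_id: str) -> dict:
    with open(f"/shared/risk-inputs/{task_id}.json") as f:  # hypothetical mount point
        return json.load(f)

# Pattern 2: the worker reads its input from an in-memory cache that a feeder
# process has populated. Faster per lookup, but one more component to deploy,
# warm, and keep consistent with the source data.
def load_task_from_cache(task_id: str) -> dict:
    cache = redis.Redis(host="cache.internal", port=6379)  # hypothetical host
    payload = cache.get(f"risk-input:{task_id}")
    if payload is None:
        raise KeyError(f"task {task_id} not found in cache")
    return json.loads(payload)
```

If the filesystem on the left already keeps the workers busy, the extra moving parts on the right are hard to justify.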
Conversely, many of the systems that had been updated in the last decade to use caches were now significantly more complex to migrate to the cloud! Hide this from your product managers. Letting them know that underinvestment pays off could set a bad precedent 😆
The article below by Behrad looks at the role caching plays in modern software. Though it isn’t HPC specific, it is still an interesting read for HPC practitioners.
To further complicate matters, we’ve also seen what were once volatile, in-memory-only caches like Redis (you should of course switch to Valkey) evolve to include persistence, becoming in-memory databases. Meanwhile, database technologies such as Cassandra and Aerospike have become so performant that the case for a separate in-memory cache is questionable.
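As an illustration of that blurring line, here is a minimal sketch of switching on persistence in Redis or Valkey from the same hypothetical Python client as above. The durability settings shown are just examples; the right trade-off between write throughput and data loss on restart depends entirely on the workload.

```python
import redis  # redis-py talks to both Redis and Valkey

cache = redis.Redis(host="cache.internal", port=6379)  # hypothetical host

# Append-only-file persistence: writes are logged and replayed on restart,
# which is what nudges a "volatile cache" towards being an in-memory database.
cache.config_set("appendonly", "yes")
cache.config_set("appendfsync", "everysec")  # trade a little durability for throughput

# RDB snapshotting can run alongside the AOF: here, dump to disk if at least
# one key changed in the last 900 seconds.
cache.config_set("save", "900 1")
```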
If you were designing an HPC system today, would you bother with a cache?