The Jagged Intelligence of AI
A non-update but still an update on HAL, our vibe-coded HPC scheduler experiment
AI can write code the same way that monkeys on a typewriter can produce the works of Shakespeare or the Monte Carlo method can calculate the value of Pi.
Ok that’s a little (maybe a lot) unfair. The AI-generated output is not random; it’s the most probable output given the model’s training data. But, and this is an important “but”, the validation required of that output is the same. In fact, the more complex the work, or the more it deviates from the median of the model’s training data, the truer this statement becomes.
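As an aside, the Monte Carlo estimate of Pi from the analogy works by generating random points and then checking, after the fact, which ones satisfy a known condition; the output is only meaningful because of that external check. A minimal sketch:

```python
import random

def estimate_pi(samples: int) -> float:
    """Estimate Pi by sampling random points in the unit square
    and counting the fraction landing inside the quarter circle."""
    inside = 0
    for _ in range(samples):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples

# The estimate converges with more samples, but any single run
# is only trustworthy because we can validate it against a known answer.
print(estimate_pi(1_000_000))
```

The parallel to AI-generated code: the generation step is cheap, but the value lies entirely in having a reliable way to check the result.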
And herein lies the problem.
It was trivially easy to generate an applicant tracking system or a company directory tool that does some basic audit tracking for internal use. Those are just CRUD operations with a pretty UI. The time taken to create and validate that they operate correctly was less than the time it would have taken to learn how to use a SaaS offering.
Sadly, that translates very poorly to creating a new HPC scheduler / workload manager. Very poorly. Validating the behaviour of such a system is not trivial. That validation itself needs code, and if that code is also AI-generated, you’re in a circular loop of doom.
This discrepancy is hard to talk about in a meaningful way. It would be tempting to just say AI can create CRUD apps but is useless in HPC. It would be a nice soundbite and provide clear, easy-to-follow guidance on how to use a complex and novel piece of technology. But it would be wrong.
The answer is instead rather more frustrating. The capabilities are highly varied and, for some reason, also seem to vary over time for very similar tasks. I think the common phrase for this is jagged intelligence. The slightly shocking part is just how jagged it is, even within a small domain.
Ironically, part of the problem is also the very thing that we struggle with when trying to use natural language to program computers – the imprecision of language. We call anyone who works with code a programmer or software engineer or some other variant, and the terms are often used interchangeably. You don’t call yourself a carpenter if you put up an Ikea bookshelf. Yet we have exactly that level of disparity between what software engineers do. I’ve talked about this at more length in the past (even before AI), but I think it’s even more relevant now for AI adoption: it differentiates not only between the capabilities of different people but also between the different tasks the same person does as part of their job.
This has been a bit of a rambling update on the state of HAL, and unfortunately the Git history won’t tell you a much better story, because I got tired of the copy pasta dance of putting the AI history into Git and hitting token limits. I’ll try again and do better.
