The fading mental model of understanding software

An engineer familiar with a system has a rough understanding of how information flows, what components exist, and how they interact. That intimacy with the code is often a big part of how quickly they can zoom in and adapt software to changing requirements. Today, LLM-powered agents can grep through an unfamiliar codebase, cross-reference stack traces, suggest a fix, and open a pull request. They can build context faster than a human can find the right file.

Cars used to be simple enough that a motivated hobbyist could figure them out. Gradually that changed. Modern cars are complex computer systems on wheels. When something goes wrong, the first thing any mechanic does is plug in a diagnostic computer. The car still gets fixed, often faster than before, but full comprehension is no longer the baseline requirement.

Software engineering is quickly starting to look like that. You can get useful behavior out of a system without really understanding how it works: prompt an AI for a feature, get a 400-line diff back, and it looks plausible enough to ship. Often it works as asked. The mental model is abstracted away behind a set of prompts.

All software ages. Requirements change, and before long someone is staring at code with no author and no memory of why it was written that way. Then the bugs come in. You're debugging code written by an LLM, asking another LLM to fix it, and watching it adapt to one requirement while regressing on two others.

LLMs can genuinely assist with writing software. But the more the mental model gets outsourced, the harder it becomes to fill the gap when needed. We keep plugging that diagnostic device back in, over and over, hoping it will pick up something it missed before. Maybe the context will be a little bit better this time; a slightly different prompt, a small tweak to the instructions. The tooling for building context is moving faster than the underlying capability¹.

At some point, someone still needs to roll up their sleeves and actually understand what's going on. The need to have a clear mental model of a software system has not yet gone away. It has just become much easier to avoid building one.

  1. ARC-AGI 3 is a benchmark designed to measure AI reasoning closer to human fluid intelligence. Current LLMs score below 1%; humans pass comfortably.

