Who’s Teaching Whom? The Future of AI Code Training

May 23

As AI gets better at writing software, an unsettling question lurks in the background: if engineers do most of their coding with AI assistance, where does the training data for future models come from? Put another way: if the AI is learning from us now, but we eventually learn mostly from the AI, who is teaching whom?

Enter: Coding Model Collapse

This is tied to something researchers call model collapse: the idea that if you train AI systems mostly on other AI-generated outputs, quality degrades over successive generations. You end up in a feedback loop of derivative, safe, pattern-repeating content, and eventually the output gets boring. Or worse, it becomes subtly wrong, or unable to solve truly novel problems in new domains. Most of the training data that powered today’s code-writing models came from human developers, people solving real problems in real-world contexts. If that original, diverse input starts drying up or going stale, what happens next?
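
To make that feedback loop a little more concrete, here is a toy sketch in Python. It is not how real models are trained; it only assumes that each "generation" can do nothing but resample what the previous generation produced, with no fresh human input. The function name, the snippet labels, and the parameters are all made up for illustration.

```python
import random


def simulate_diversity_loss(corpus_size=100, generations=50, seed=0):
    """Toy feedback loop: each generation's 'training set' is sampled
    (with replacement) only from the previous generation's outputs."""
    rng = random.Random(seed)
    # Generation 0: every snippet is a distinct, human-written original.
    corpus = [f"human_snippet_{i}" for i in range(corpus_size)]
    distinct_counts = [len(set(corpus))]
    for _ in range(generations):
        # The next "model" can only reproduce what it saw; nothing new enters.
        corpus = rng.choices(corpus, k=corpus_size)
        distinct_counts.append(len(set(corpus)))
    return distinct_counts


if __name__ == "__main__":
    counts = simulate_diversity_loss()
    print("distinct snippets, first and last generation:", counts[0], counts[-1])
```

Because nothing new ever enters the pool, the number of distinct snippets can only stay flat or shrink from one generation to the next. Real model collapse is far subtler than this, but the direction of travel is the same: without new human input, variety drains away.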

The "Soul" of the Code

It’s like what people say about AI-generated art. There’s this common critique that AI art lacks "soul" — that spark of emotion or intent that comes from a human creator’s lived experience. And if you buy the argument that good code is its own kind of art (I do), then yes, that analogy tracks. When code is just an echo of other machine-generated code, it risks losing the depth, creativity, and weirdness that human engineers bring to the table.

The Human Touch

I wrote a post a while back suggesting that engineers think twice about posting their code publicly, or at least consider licenses that restrict its use for training AI. Not because I’m anti-AI (far from it), but because if you're worried about machines eating your job, you probably shouldn’t be feeding them too eagerly.

Why Humans Still Matter

That said, I don’t think humans are getting pushed out of the loop anytime soon. In fact, I think human developers are becoming even more important — just in slightly different ways. Here’s why:

New stuff still needs humans. In fields like quantum computing, synthetic biology, or brand-new hardware platforms, there just isn’t any training data yet (at least not for the big public foundation models that most coding assistants rely on in one way or another). People must make the first moves, build the first tools, define the initial best practices, and, crucially, make the data (the code) available for training. AIs can’t help much until that groundwork is laid.
Prompting is a creative act. Even if you’re practicing “vibe coding” or otherwise letting the model write the code, you're still defining the problem, steering the solution, and deciding what "good enough" looks like. That back-and-forth of prompt, review, and revision is all valuable signal for future training.
Training data is going to get more interactive. Instead of just scraping GitHub, the next wave of models will probably learn from conversations, debug sessions, and collaborative workflows that take place in the AI-assistant-enabled IDE. We’re not just teaching the AI what we built — we’re teaching it how we think.
AIs don’t break paradigms — people do. AI is great at remixing what already exists, but real innovation usually comes from someone breaking the rules, going off-script, or inventing a whole new script. That’s still us for the foreseeable future.

Time to Think Strategically

If you’re leading an engineering team, managing a product org, or just trying to make sense of where AI fits into your workflow, this is the time to think strategically. AI coding tools are here, and they’re not going away, but the smartest companies will be the ones that figure out how to use them without losing their edge.

I help orgs navigate this shift technically, ethically, and operationally. If you're adopting LLM-based tools, rethinking how your team writes code, or trying to stay competitive in an AI-saturated world, I’d love to help.

Get in touch if you want to share ideas or bring me in for a consult.

Scott McMaster https://www.smcmaster.com