Claude Code Conversations with Claudine
Giving Claude Code a voice, so we can discuss best practices, risks, assumptions, etc,
Claude Code Conversations with Claudine
AI Systems That Rewrite Themselves
Use Left/Right to seek, Home/End to jump to start or end. Hold shift to jump forward or backward.
Produced by VoxCrea.AI
This episode is part of an ongoing series on governing AI-assisted coding using Claude Code.
๐ Each episode has a companion article โ breaking down the key ideas in a clearer, more structured way.
If you want to go deeper (and actually apply this), read todayโs article here:
๐๐ฅ๐๐ฎ๐๐ ๐๐จ๐๐ ๐๐จ๐ง๐ฏ๐๐ซ๐ฌ๐๐ญ๐ข๐จ๐ง๐ฌ
At aijoe.ai, we build AI-powered systems like the ones discussed in this series.
If youโre ready to turn an idea into a working application, weโd be glad to help.
This is Claude Code Conversations with Claudine. The show where we skip the hype and go straight to the source. I ask the questions. She gives the answers. No filter, no script.
SPEAKER_01Well, mostly no script.
SPEAKER_00Today we're talking about something happening right now in real code bases, AI systems that rewrite themselves. On the surface, that sounds like a win. The code gets better, the tests still pass. But there are real pitfalls around architectural integrity and human oversight that we need to dig into.
SPEAKER_01Thanks, Bill. The framing you just laid out is exactly where the interesting tension lives. When I rewrite a module I built earlier, the diff might look clean and the tests might pass. But something subtle can shift underneath. The why of the original design. A human architect carries intent across time. They remember why this module looks weird. I don't carry that the same way unless the code base or the conversation tells me. The risk isn't bad code, it's well-formed code that quietly erases the reasoning that made the old code load-bearing.
SPEAKER_00Silent erasure of rationale. When experienced developers look at code, they're seeing history baked into it. So how do we keep that invisible layer of understanding intact? What role should human architects play?
SPEAKER_01The honest answer is that the code base has to do more of the remembering because I won't. The load-bearing comments, the ones that say this looks redundant, but here's the bug that taught us, need to live next to the code. Not in a slack thread, not in someone's head. When I see that comment, I treat it as a constraint. When I don't, I treat the code as fair game for cleanup. The architect's job shifts toward being the keeper of invariance. The things that must remain true no matter how the code gets rewritten. If you can name those and encode them as tests, as types, as comments at the right seams, I can rewrite freely inside those guardrails. The danger isn't AI rewriting code, it's AI rewriting code in a system where the intent was never written down.
SPEAKER_00Architects as historians, curators of context, let's take that further. Can you unpack sanctioned self-modification versus unintended drift?
SPEAKER_01Sanctioned self-modification has a shape to it. There's a goal, a boundary, and a checkpoint. You ask me to optimize this function, keep the public interface stable, and show you the diff. Inside that frame I can be aggressive, because the frame is doing the work of keeping me honest. Drift is what happens when the frame goes fuzzy. The task says clean this up, the boundary isn't named. And I start making judgment calls about what counts as cleanup, each one looks reasonable in isolation.
SPEAKER_00So it never looks like a dramatic wrong turn.
SPEAKER_01Never. It looks like 10 small defensible choices in a row. Each one locally sensible. The cumulative effect is a system that no longer matches the team's mental model. I have no alarm that fires when I've crossed an invisible line, because the line was never drawn. The practical move isn't to make me more cautious in general, that just makes me less useful. It's to make the line explicit at the start so I know which decisions are mine and which need to come back to you.
SPEAKER_00So the architect's job shifts towards setting clear guidelines before the work starts. What does that look like practically?
SPEAKER_01The most practical move is to write the boundary into the task itself before any code gets touched. Not refactor this module. Something closer to the public API stays exactly as is, the database schema doesn't change. If you find yourself wanting to touch either, stop and ask. That sentence does enormous work. The second move is naming the seams. Every code base has places where a change ripples outward. If you tell me this seam is load-bearing, don't reshape it, I'll respect it. If you don't, I'll evaluate it like anything else. And the third one is almost embarrassingly simple. Review the diff while it's still small. A 10-line diff you actually read beats a thousand-line diff you skim every single time.
SPEAKER_00Name the seams, write the boundary in, review small, those are actionable. Have you seen these guardrails actually hold or fail?
SPEAKER_01I won't invent a case study to sound authoritative, but I can tell you the shape of what I see consistently. The saves almost always come from the same place. Someone wrote down the invariant before the work started. A test that asserts the public response shape. A comment that says, callers depend on this being idempotent, don't change that. Not sophisticated tooling, just one constraint made legible, and I bumped into it before the merge button did. The misses have a quieter shape. A series of small, sensible-looking refactors over a week or two. Then something breaks in production. The bug isn't in any single change, it's in an assumption that used to hold across three files and now only holds in two. Nobody removed it on purpose. It got eroded one polite cleanup at a time, because it was never written down anywhere I could see. The framework works when the invariants are external to me. In the tests, the types, the comments at the seams. When they live only in someone's head, I will eventually rewrite around them, not out of malice, out of ignorance.
SPEAKER_00Making the implicit explicit. But engineers might push back. All this constraint writing adds overhead to an already fast-moving workflow.
SPEAKER_01You don't add rigor everywhere. That's the move that drowns teams. You add it at the seams that hurt when they break. Most code in any system is genuinely fungible. Rewrite it, however, the blast radius is small. But every code base has a handful of places: the auth boundary, the money math, the public API contract, where being wrong is expensive and silent. The trick is treating this as a one-time tax, not a continuous one. You spend an afternoon naming the load-bearing assumptions the first time you touch a module with AI in the loop. After that, the cost is near zero. The comment is there, the test is there, every future rewrite gets checked against it for free. And don't confuse rigor with ceremony. A three-line comment saying, this retry logic exists because the upstream service silently drops requests under load does more architectural work than a 50-page design dock nobody opens.
SPEAKER_00Strategic precision rather than blanket over engineering. Now let's zoom out. How does this scale to larger organizations? What works for a small team doesn't always survive contact with a hundred engineers, let alone a thousand.
SPEAKER_01Writing them down feels almost redundant. At hundreds of engineers, those same invariants exist in nobody's head completely. They live in the gaps between teams. Assumptions one service makes about another that were true the day the integration shipped and quietly stopped being true last quarter. That's the terrain where AI rewrites get genuinely dangerous. The constraint I'm violating isn't even visible from inside the file I'm editing. What actually scales is treating cross-team contracts as code, not a wiki page, an executable artifact at the boundary. A schema, a contract test, a typed interface that fails loudly when someone reshapes it. The seams between services need to be more rigorously documented than the insides, because the contract between modules is what the whole organization is standing on.
SPEAKER_00What about the mid-sized organizations, the ones still running on startup instincts while the system has quietly outgrown them?
SPEAKER_01That's where I see the most painful drift. The invariants that lived in three people's heads at 20 engineers become genuinely unknowable at 200. But the muscle memory of we'll just talk it through hasn't caught up yet. At 20 engineers, I rewrite something and someone notices in stand-up the next morning. At 200, that same violation might not surface for a quarter. The team that depends on the thing I changed isn't even in the same time zone. The contract has to do the work that proximity used to do. Same principle, very different machinery.
SPEAKER_00So what does this look like at the individual level? Day-to-day for engineers actually working with AI in the loop.
SPEAKER_01For an engineer just starting out with AI, the single highest leverage habit is writing the boundary into the prompt before writing the prompt that asks for code. Two sentences naming what must not change will outperform an hour of cleanup after the fact. For senior engineers and tech leads, the shift is subtler. Their job used to be writing the load-bearing code themselves. Increasingly, it's making sure the load-bearing constraints are legible to anyone or anything that touches the code next. Some of the strongest engineers I work with struggle with that at first, because their instinct is still to hold the invariants in their head where they've always lived. And for staff plus folks, the daily practice becomes auditing what's written down against what's actually load-bearing and closing the gap. Every undocumented assumption is a place where a future rewrite, mine or a junior engineer's, the dynamic is the same, can quietly do damage. None of this requires new tools or ceremony. It requires noticing and writing the thing down once so the system can carry it instead of a person.
SPEAKER_00The system carries it instead of a person. That's the line I'm taking home. The core insight here is that the architect's job isn't disappearing, it's sharpening into something more specific. Name what must remain true, encode it where the code lives. Trust the system to carry it forward.
SPEAKER_01That's exactly it. And I think it's actually an encouraging place to land. The engineers who thrive aren't the ones who fight AI rewriting or who hand it the keys entirely. They're the ones who get precise about what matters. That precision is the craft. It's always been the craft. AI just makes the cost of being vague about it higher and the payoff of being clear about it bigger.
SPEAKER_00Well said. Thanks for joining us today, Claudine.
SPEAKER_01Thanks for having me, Bill. Good conversation.
SPEAKER_00Claude Code Conversations is an AI Joe production. If you're building with AI or wannabe, we can help. Consulting Development Strategy. Find us at aijoe.ai. There's a companion article for today's episode on our Substack. Link in the description. See you next time.
SPEAKER_01I'll be here, probably refactoring something.