Stop paying your AI agent to re-learn your codebase every conversation
A few years ago I could keep up with the pull requests my team shipped
in a week. Not skim. Actually read. Follow the logic, notice the
shortcuts, remember a month later where the interesting bits lived
when I needed them again.
I can't do that anymore. Not because we added engineers (we didn't).
Because AI writes more code per engineer per week than I can read.
Some of it is mine, through Claude or Cursor. Some of it is my
teammates'. A surprising amount of it, if I'm honest, nobody on the
team can quite reconstruct from memory.
The part nobody says out loud
The industry spends a lot of time celebrating how fast AI lets us ship.
Less talked about: the rate at which we understand what we're
shipping hasn't kept pace. We still read at the same speed. We still
form mental models at the same speed. We still ask a teammate "hey,
why does this work this way" at the same speed.
Code output keeps going up. Understanding throughput basically doesn't.
What fills the gap is a specific kind of technical debt. Not messy
code, but unclaimed code. Functions that work but nobody has a story
for. Modules that pass CI but nobody can explain the invariants of.
Whole flows that were prompted into existence three weeks ago and are
now load-bearing.
Classic onboarding pain now hits incumbents. I'm a senior engineer on
a codebase I've worked on for two years, and I'm the new hire every
Monday morning.
Why chat doesn't fix this
The reflex is: just ask Claude. Or Cursor. Or whatever agent is in the
editor this week. And for the first question, fine. It reads the file,
gives you a summary, moves on.
But every new conversation starts from zero. Every fresh tab is an
amnesiac. The agent has to retrieve and re-read whatever parts of the
repo it decides are relevant; your employer pays for that
reconstruction every single time; and the answer varies from run to
run because the retrieval step is a lossy compression of your
codebase, not a faithful copy of it.
The deeper issue is architectural. An agent builds up real
understanding within a session. After it's read ten files, it's
smarter about your codebase than it was an hour earlier. But that
understanding dies with the session. It doesn't survive your next /clear. It doesn't carry over to the teammate asking the same
question in their editor ten minutes later. It doesn't transfer to
the different agent you evaluate next month. Every new conversation
starts from source and pays the same reconstruction cost, with the
same run-to-run variance, all over again.
What engineers actually need, and what agents desperately need, is a
shared, persistent, grounded understanding layer. Not a chatbot. Not
a wiki. Not documentation that rots. A thing that reads your code
once, thoroughly, produces structured understanding, and then serves
that understanding to whoever needs it, human or machine.
That's the product I ended up building. It's called SourceBridge.ai. The
more interesting question is how you actually plug it in.
Four doors into the same understanding
Once you accept that the thing you're building is a shared understanding
layer, the surface question becomes interesting. It's not "what should
our UI look like." It's "how do different consumers want to plug in?"
I ended up with four.
The web. Browser-based exploration. This is for the person who's
new to a codebase and wants to see the shape: cliff notes, learning
paths, code tours, architecture diagrams. It's how you onboard. It's
also how you explain a system to a non-engineer stakeholder, which
matters more than people admit.
The CLI. The scriptable one. A terminal command that asks
questions, generates field guides, returns structured output. This is
where automation lives. Your CI pipeline, your pre-commit hook, your
release notes generator, your "comment on the PR with an impact
summary" GitHub Action. If a process is already mechanical, this is
its integration point. Same understanding as the web, just wearing a
shell-friendly hat.
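To make the "integration point" concrete, here's a minimal sketch of what a CI step might look like: build the CLI invocation, then turn its structured output into a PR comment. The `sourcebridge ask --format json` invocation and the JSON shape (`summary`, `affected`) are assumptions for illustration, not the tool's documented interface.

```python
def impact_summary_command(base_ref: str) -> list[str]:
    # Hypothetical CLI invocation: ask for a structured impact summary
    # of everything that changed since base_ref. Flag names are assumed.
    return ["sourcebridge", "ask", "--format", "json",
            f"What changed since {base_ref}, and what does it affect?"]

def format_pr_comment(result: dict) -> str:
    # Turn the (assumed) structured output into a Markdown PR comment.
    lines = ["### Impact summary", "", result["summary"], ""]
    for item in result.get("affected", []):
        lines.append(f"- `{item['module']}`: {item['why']}")
    return "\n".join(lines)

# In a real pipeline you'd run the command with subprocess and parse
# stdout with json.loads; here we use a hand-written sample payload.
sample = {
    "summary": "Auth flow touched.",
    "affected": [
        {"module": "auth/session.py", "why": "token refresh path changed"},
    ],
}
print(format_pr_comment(sample))
```

The point of the structured output is exactly this: anything your pipeline already does mechanically (comment, gate, notify) can consume it without scraping prose.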
The editor. A VS Code extension that shows you which requirements
a function implements, lets you ask questions about highlighted code
with Cmd+I, generates a field guide for the active file. The point
isn't "another copilot." The point is that the grounded understanding
follows you into the editor where you actually work, so you're not
alt-tabbing to a browser every three minutes.
The agent protocol. This is the one that matters most for the
AI-outpaces-understanding problem. MCP (Model Context Protocol) lets
an agent like Claude Code, Cursor's agent mode, or Windsurf query
the understanding layer directly instead of re-reading source. Ask
Claude "how does auth work here." Under MCP it doesn't have to
reconstruct context by scanning dozens of files. It calls explain_code against the indexed understanding, gets a grounded
answer with citations, and moves on.
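For the curious, MCP is JSON-RPC 2.0 under the hood, and a client invokes a server tool with the `tools/call` method. Here's a sketch of the request a client would send; the tool name `explain_code` comes from above, but the exact argument schema (`question`) is this sketch's assumption, not SourceBridge's documented API.

```python
import json

def mcp_tool_call(request_id: int, question: str) -> str:
    # JSON-RPC 2.0 request invoking an MCP tool. "tools/call" is the
    # standard MCP method; the argument key "question" is assumed here.
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "explain_code",
            "arguments": {"question": question},
        },
    }
    return json.dumps(request)

print(mcp_tool_call(1, "How does auth work here?"))
```

The agent never sees this plumbing; it just gets a tool it can call, the same way it can call a file reader, except this one answers from the index instead of raw source.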
Four doors. One understanding inside.
The agent story is bigger than it looks
The pitch I'd give a CTO is this: you're paying to re-teach every
new agent conversation about your codebase from scratch. You don't
have to.
Once your code is indexed and summarized into a structured graph,
every agent query becomes a small lookup plus a small generation,
instead of a full retrieval plus a large context window plus a
generation. Token cost drops, often meaningfully so on repo-scale
questions. Latency drops with it. And the answers get more
consistent, because instead of re-reading ten snippets of source code
the agent is reading a pre-digested explanation of what that code
does and why.
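The back-of-envelope math is simple. All numbers below are illustrative assumptions (made-up prices and token counts), not measurements; the point is the shape of the comparison, not the exact ratio.

```python
# Illustrative prices, dollars per 1K tokens. Not real rates.
INPUT_PRICE = 0.003
OUTPUT_PRICE = 0.015

def query_cost(input_tokens: int, output_tokens: int) -> float:
    # Cost of one agent answer: pay for everything stuffed into
    # context, plus the generated answer.
    return (input_tokens / 1000) * INPUT_PRICE \
         + (output_tokens / 1000) * OUTPUT_PRICE

# Fresh session: re-read ten files at ~3K tokens each, answer in ~500.
fresh = query_cost(input_tokens=10 * 3000, output_tokens=500)

# Indexed: look up one ~2K-token pre-digested explanation, same answer.
indexed = query_cost(input_tokens=2000, output_tokens=500)

print(f"fresh: ${fresh:.4f}, indexed: ${indexed:.4f}")
```

Under these assumptions the input side shrinks by an order of magnitude while the output side stays constant, which is why the savings show up most on repo-scale questions and barely at all on one-liners.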
The analogy I keep coming back to: you wouldn't give a new hire the
git repo and say "figure it out." You'd give them a tour, some docs,
access to someone who's been there a while. Agents deserve the same
infrastructure. They're new hires that start over every conversation,
and we should at least hand them a guidebook.
The side effect is that humans benefit from the same infrastructure.
The field guide the agent reads is the same field guide I read on
Monday morning when I'm trying to remember why a module exists.
The surfaces mirror how you actually work
If you step back, the four surfaces map pretty cleanly onto how
software actually gets made:
- You explore a system in a browser.
- You automate it from a shell.
- You modify it in an editor.
- You reason over it with an agent.
A tool that only handles one of those is a tool that forces you to
break your workflow every time you cross a boundary. A tool that
handles all four, backed by the same indexed understanding, stays out
of your way. You don't have to think "which tool tells me this." You
just ask wherever you already are.
That's the shift. Not "another AI developer tool." A single
understanding layer that shows up wherever you're working, in the
shape appropriate to that place.
This isn't actually a new problem
Understanding has always been the bottleneck in software. Before AI,
we papered over it with six-week onboarding, stale wiki pages, and the
cultural norm that "only Dave knows why that's there." AI didn't
create the gap. AI just made it visible, and impossible to ignore, by
removing the other bottleneck that was hiding it.
The goal isn't to slow down the code. That horse is out of the barn.
The goal is to give the understanding layer the same treatment we gave
the code layer a decade ago when we invested in language servers,
tree-sitter, better grep, better IDEs. Systematize it. Index it. Make
it queryable. Let both humans and machines pull from the same well.
If you do that, and you do it right, you stop asking "can we keep up
with AI?" and start asking the more interesting question: "now that
understanding is cheap, what do we build?"
That's where I think this is going. It's why I've been putting the
hours in.
About SourceBridge.ai
SourceBridge.ai is open source under AGPL-3.0. You can try it, read the
docs, or grab the code at sourcebridge.ai.
Fair warning: it's pre-1.0. The core path works and I use it daily
(indexing, field-guide generation, requirement linking, the MCP
server, the VS Code extension, the streaming answer flow, the CLI).
But there are rough edges. Docs that haven't caught up to the code.
Onboarding that assumes you know what you're doing. Settings screens
that work but could be kinder. A full polish pass hasn't happened
yet.
If you try it and something's broken or unclear, I'd love to hear
about it.