AI-Assisted Development · Brownfield

What an AI Agent Needs to Know Before It Edits a 12-Year-Old Codebase

What an AI coding agent has to understand first, and how to brief it, before it writes a single line.

Codebase comprehension Requirement interrogation Claude Code, Copilot, Cursor

A 12-year-old codebase, or any system more than a decade old, is a record of decisions nobody wrote down. Point an AI coding agent at it, ask for a change, and it produces clean, confident code against assumptions it invented on the spot.

The agent is not lying to you. It answered a question it was never equipped to ask.

The agent's default move is to write, not read

In Stack Overflow's 2025 developer survey of more than 49,000 engineers, 84% use or plan to use AI coding tools, yet only 60% view them favorably, down from 70% two years earlier. The single biggest complaint: 66% say their largest time sink is AI output that is almost right.

Almost right is the signature failure of an agent editing code it does not understand. On a greenfield project the agent invents the conventions, so its guesses match. On a 12-year-old system the conventions already exist, undocumented, and its guesses collide with them.

The issue is not the model. Claude Code, Claude Opus 4.x, and OpenAI's GPT-5 series are all strong readers when you point them at the right context, whether you run them through Copilot or Cursor. The issue is that "edit this file" gives them no reason to read first.

The alternative is to let the agent write against its own guesses and catch the damage in review, or worse, in production. On a system a client has depended on for a decade, that is the most expensive place to find a misunderstanding.

Three things an agent must understand before it edits

Before any human or agent changes a line, three questions decide whether the change is a fix or a future bug. On a long-lived codebase, all three have answers the agent cannot see from the file alone.

The system. The architecture, the layer boundaries, the conventions, and the direction dependencies are allowed to point. A change that ignores the dependency direction compiles fine and rots the structure quietly.

The change site. What this specific file does, and every place it is invoked from. A function that looks safe to modify in isolation may carry three places that depend on it, each with different expectations.

The invariants. What must stay true after the edit. The ordering guarantee, the idempotency rule, the audit-log write a regulator depends on. These are rarely in a comment and almost never in the agent's context window.

Skip any of the three and the agent is guessing. A well-formatted guess is indistinguishable from a well-reasoned fix until production tells you otherwise.

You can tell the difference between an agent that read the system and one that skimmed it:

Read the system

Cites the specific files it read
Names the existing function it plans to reuse
States the invariant it intends to preserve

Only skimmed it

Hands back a summary of the README
Describes the language and framework
Proposes a plan without naming a single file

Brownfield is where the agent's assumptions break

GitClear, a firm that analyzes git commit history across millions of repositories, studied 211 million changed lines of code. It found that duplicated blocks of five or more lines rose eightfold during 2024, while refactoring fell from 25% of changes to under 10%.

On a decade-old codebase, that number has a concrete shape. Asked to "add a validator" without first reading the system, the agent writes a fresh one, even though a validator for the same rule already lives three folders away. The fresh code looks fine in isolation and degrades the codebase in aggregate.

The fix is an ordering rule, not a smarter prompt. Comprehension first, change second. Ask the agent to map the system and find existing implementations before you ask it to write anything, and the duplicate it would have created becomes a function it reuses.

Give the agent durable memory, not a fresh briefing every time

An agent starts every session with no memory of what you figured out last week. The way to fix that is a hand-written instructions file the tool loads into every request: CLAUDE.md for Claude Code, which our engineers use on most engagements, .github/copilot-instructions.md for Copilot, project rules for Cursor.

Treat it as durable understanding, not documentation. Four sections carry their weight: the exact commands to build and test, where things live, the conventions that have teeth, and pointers to the internal code that matters. No project history, no philosophy, no "write clean code."

Use one test to decide what stays: if deleting a line would not change what the agent does, delete it.

"Write maintainable code" fails the test. "We use TanStack Query for server state, never useEffect" passes, because it changes the next file the agent writes.

Interrogate the request, not just the code

Codebase comprehension is half of understanding. The other half is interrogating the request, because a requirement is a hypothesis about what someone wants, and most of them hide at least one trap.

Three failures show up again and again:

The ambiguous noun. "Add user notifications," where nobody has said what a notification actually is.
The hidden assumption. "Make it idempotent," at what layer, and for which requests.
The missing constraint. "Make it faster," faster than what, and within what latency budget.

Worked example: "add user notifications"

Before any code, the agent should surface that a notification could mean an in-app feed with read state, a transactional email on specific events, or a push to a mobile device. Three genuinely different builds.

The one blocking question, the one a non-technical stakeholder can answer: which channel, and does it have to be real-time or can it batch?

That is the move that pays off: interrogation before generation. Ask the agent to list the ambiguous terms, produce three genuinely different interpretations, and name the single question whose answer decides the path, before it proposes a design. Strong interrogation surfaces an assumption the requester did not know they were making.

When this is worth it, and when it is not

This discipline has an honest price: the first ticket on a system is slower, because reading before writing always is. On a throwaway greenfield prototype, most of it is overhead, and you should let the agent run.

The payoff scales with two things: the age of the codebase and the price of getting a change wrong. A 12-year-old system carrying audit, compliance, or data-integrity rules is exactly where an unread edit is most expensive, and where comprehension-first earns its keep.

On our legacy modernization engagements, the first artifact our engineers produce is a written comprehension memo: what the system does, where the change lands, and the invariants it must preserve, before a single line changes. It is the cheapest insurance on the engagement.

This is the Understand phase of the four-phase process we run on client codebases, and it is the first thing our engineers do on any legacy modernization engagement.

It is also why an agent that ships cleanly on a greenfield product build needs different guardrails the moment it touches a system someone has depended on for a decade. The same comprehension-first discipline carries into our agentic AI work, where an agent acting on production data has even less room to guess.

An AI agent does not need to be smarter to edit a 12-year-old codebase safely. It needs to read before it writes, remember what you taught it, and question the ask before it answers.

The model already does the hard part. The discipline is yours to set.

AnAr Solutions builds and modernizes software with AI in the loop on every engagement. We run a documented four-phase process, Understand, Plan, Implement, Verify, on client codebases where an unexplained change is expensive. See how we engineer AI-assisted development.

What an AI Agent Needs to Know Before It Edits a 12-Year-Old Codebase

The agent's default move is to write, not read

Three things an agent must understand before it edits

Brownfield is where the agent's assumptions break

Give the agent durable memory, not a fresh briefing every time

Interrogate the request, not just the code

When this is worth it, and when it is not

Recent Posts