Your Codebase Isn't Ready for AI — But You Knew Better

I've heard engineers say "AI won't work in our repo." I've also talked to peers where it's transforming how their teams ship. The difference isn't the tools — it's the codebase underneath them (and some trust issues).

There's hope here. But only if you do the work first.

AI is a multiplier. It multiplies whatever engineering culture already exists — and exposes every process inefficiency you should have fixed a long time ago.

Disciplined, tested, documented codebase → AI makes it dramatically faster. Undocumented, undertested, architecturally opaque codebase → AI accelerates the debt.

After a decade leading engineering organizations across hundreds of engineers, monoliths, and microservices, here are the 7 principles I'd act on immediately.

What "AI-Ready" Actually Means

Before prescribing what to do, it's worth being precise about the problem.

An AI coding agent — whether it's generating new code, reviewing existing code, or attempting to fix a bug — needs to answer four questions about any codebase it touches:

What does this code do? (Comprehension)
What does correct behavior look like? (Verification)
Where does this fit in the larger system? (Context)
What are the rules I must follow here? (Standards)

A brownfield enterprise codebase typically fails all four.

Comprehension fails because brownfield code is dense with implicit context. Variable names like svc2, methods called doProcess, and classes modified by forty developers over a decade without a single architectural comment. The AI can read the syntax. It cannot read the intent.

Verification fails because most brownfield systems have inadequate automated test coverage. When AI generates or modifies code, it cannot validate its own output. It needs a test suite to run. If your coverage is 20%, the AI operates in near-darkness.

Context fails because brownfield systems have accumulated abstractions, workarounds, and undocumented dependencies that don't appear in any single file. People forgot why the thing was built that way and what the use-case was. The UserService calls the LegacyBillingAdapter which has a race condition on Tuesdays because of a cron job in a completely different repository. AI doesn't know this. Nobody wrote it down.

Standards fail because brownfield organizations typically have accumulated multiple generations of style guides, framework versions, and architectural patterns — often coexisting in the same codebase or divided across teams. AI will follow whatever pattern it sees most frequently, which in a brownfield system is often the worst one.

AI-readiness is the state in which your codebase can answer all four questions reliably and consistently. You don't need to be perfect. You need to be clear.

7 Principles (Positions)

1. Test Coverage Isn't a Quality Metric — It's an AI Prerequisite

If test coverage on your critical services is below 70%, AI is operating in the dark, and you should be striving for 95%. It generates code it cannot verify. You don't have an AI assistant; you have a code generator whose output someone else has to validate by hand across every dependent area of the codebase.

Think of it this way: how much easier is it to onboard a junior engineer when they can read the tests to understand what the code does? Treat your AI assistant as that junior engineer, and make its onboarding easier.

The threshold that matters: 70% is the floor, 95% is the goal. Below that floor, you're not using AI as a coding assistant. Above 95%, AI can operate with genuine confidence and autonomy.

What to do this week: Pull your coverage report. Identify your five most critical services below 70%. Assign a sprint to each. Frame this to your teams not as "writing tests" but as "making the codebase legible to AI" — because that's exactly what it is.
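As a rough sketch of that triage step, the snippet below ranks services that fall under the 70% floor, lowest coverage first. The service names and line counts are made up for illustration, and it assumes you have already extracted covered and total line counts per service from whatever coverage tool you use.

```python
from dataclasses import dataclass

FLOOR = 70.0  # percent; the floor named in this article
GOAL = 95.0   # percent; the goal named in this article


@dataclass(frozen=True)
class ServiceCoverage:
    name: str
    covered_lines: int
    total_lines: int

    @property
    def percent(self) -> float:
        # Treat a service with no measured lines as 0% rather than dividing by zero.
        if self.total_lines == 0:
            return 0.0
        return 100.0 * self.covered_lines / self.total_lines


def below_floor(services, floor=FLOOR, limit=5):
    """Return up to `limit` services under the floor, lowest coverage first."""
    failing = [s for s in services if s.percent < floor]
    return sorted(failing, key=lambda s: s.percent)[:limit]


# Hypothetical numbers, for illustration only.
REPORT = [
    ServiceCoverage("billing", 1200, 6000),   # 20%
    ServiceCoverage("auth", 4500, 5000),      # 90%
    ServiceCoverage("checkout", 3300, 5000),  # 66%
    ServiceCoverage("search", 970, 1000),     # 97%
]

if __name__ == "__main__":
    for svc in below_floor(REPORT):
        print(f"{svc.name}: {svc.percent:.1f}% (floor {FLOOR:.0f}%)")
```

The output of a script like this is a starting list for sprint assignment, not a verdict: a service at 72% with no tests around its riskiest paths may still deserve attention first.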

The uncomfortable truth: if your organization has been shipping without tests for years, you have a cultural problem, not a technical one. Tests didn't get written because the incentive structure didn't reward them. AI doesn't change that incentive structure; it just raises the stakes. You need to change the culture, not just the coverage number. Over time, that also reduces how much your code reviews (the next AI bottleneck) depend on institutional knowledge, that is, on human eyes.

2. Undocumented Architecture Is Organizational Debt

"The code is the documentation" was always wrong. For AI, it's catastrophic.

AI can read your syntax. It cannot read the 2017 conference room conversation about why you put all user state in Redis instead of Postgres. It cannot read the Slack thread where someone mentioned a known race condition in the payment service that was deprioritized indefinitely. It cannot read the institutional knowledge of the one principal engineer who has been here since the beginning.

Architecture Decision Records (ADRs) are non-negotiable. Every significant architectural decision needs to exist as a document in the repository — not in Confluence, not in a Google Doc. In the repository, version-controlled, searchable by AI.

This is not a documentation project. It is an AI-enablement project, and it pays compound interest. As a matter of fact, your software projects have all become AI-enablement projects, too. Teams that adopt ADRs report faster onboarding, fewer repeated debates, and dramatically better AI output.

What to do this week: Ask an LLM (using context you've already given it from previous chat sessions, or an MCP integration with your codebase) to help you map your services and dependencies. Use that as both your ADR starting point in the project and the global template going forward. Let AI help you document the codebase it will work in.
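For a sense of how small an ADR can be, here is a minimal sketch that renders one as a Markdown file. The Context / Decision / Consequences layout is one commonly used ADR structure and is an assumption here, as are the `docs/adr` directory and the function names. The example title borrows the Redis decision mentioned above, with placeholder text where the real reasoning belongs.

```python
import re
from datetime import date
from pathlib import Path


def adr_filename(number: int, title: str) -> str:
    """Build a sortable file name such as 0001-store-user-state-in-redis.md."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{number:04d}-{slug}.md"


def render_adr(number, title, context, decision, consequences,
               status="Accepted", decided_on=None):
    """Render one ADR as Markdown text."""
    decided_on = decided_on or date.today()
    return "\n".join([
        f"# {number}. {title}",
        "",
        f"Date: {decided_on.isoformat()}",
        f"Status: {status}",
        "",
        "## Context",
        context,
        "",
        "## Decision",
        decision,
        "",
        "## Consequences",
        consequences,
        "",
    ])


if __name__ == "__main__":
    # Placeholder text: fill in the real reasoning from the people who were there.
    title = "Store user state in Redis"
    text = render_adr(
        number=1,
        title=title,
        context="(why a decision was needed)",
        decision="(what was chosen, and what was rejected)",
        consequences="(what this makes easier or harder)",
    )
    target = Path("docs/adr") / adr_filename(1, title)  # hypothetical location
    print(f"Would write {target}:\n{text}")
```

The numbering and slug keep the files ordered and greppable, which matters more for an AI agent scanning the repository than the exact section headings do.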

3. The Paved Road Must Exist Before AI Can Follow It

"Paved road" is the concept of providing a well-maintained, well-documented path for common engineering tasks — a standard way to create a new service, a standard way to add a database migration, a standard way to instrument observability.

AI is the fastest paved-road follower in the world. Give it a clear pattern and it reproduces it at speed, cycling until it gets it right. Give it no pattern, and it invents one to please you — likely drawing on your worst anti-patterns, since those appear most frequently in a brownfield system.

Every AI-ready codebase needs: a service creation scaffold, standard patterns for logging and tracing with real examples, a documented database migration process, a contribution guide that explains the why behind style decisions, and clear CODEOWNERS. You don't have to be an overachiever here; you can iterate over time.

Try this: Ask five engineers to create a new endpoint following your standard patterns. If you get five different approaches, you don't have a paved road — but now you can use these different approaches to standardize on the best one and use it as the canonical example going forward.

4. Dependency Rot Is the Silent AI-Readiness Killer

Brownfield systems accumulate dependencies the way old houses accumulate furniture. Pinned packages from 2019. Transitive dependencies nobody audited. Packages with known CVEs that "work" and that nobody wants to touch. AI cannot reason well about dependency context it hasn't been trained on, and AI-generated changes in an unmanaged dependency ecosystem create an unpredictable blast radius.

The tools that make this largely automatic (Renovate, Dependabot, Socket) exist, and each has a free tier or open-source option. Use them!

What to do this week: Enable Renovate or Dependabot on every repository. Run a dependency audit. Identify critical CVEs and triage them. Assign ownership — even if that owner is an AI agent.
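Before the automated tools are in place, a first-pass audit can be as simple as listing exact pins and flagging the old ones. The sketch below assumes a requirements-style file and a release-date lookup that you supply yourself. The package names, versions, and dates are hypothetical, and the parser deliberately ignores extras, environment markers, and other specifier forms.

```python
from datetime import date


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Extract exact pins (name==version) from requirements-style text."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def stale_pins(pins, release_dates, today, max_age_days=730):
    """Return (name, version, age_in_days) for pins older than max_age_days, oldest first."""
    stale = []
    for name, version in pins.items():
        released = release_dates.get((name, version))
        if released is None:
            continue  # unknown release date: needs a manual look
        age = (today - released).days
        if age > max_age_days:
            stale.append((name, version, age))
    return sorted(stale, key=lambda item: item[2], reverse=True)


# Hypothetical inputs, for illustration only.
REQUIREMENTS = """\
examplelib==1.4.0
otherlib==3.2.1  # pinned during a migration
unpinnedlib>=2.0
"""
RELEASE_DATES = {
    ("examplelib", "1.4.0"): date(2019, 3, 1),
    ("otherlib", "3.2.1"): date(2024, 1, 15),
}

if __name__ == "__main__":
    for name, version, age in stale_pins(
        parse_pins(REQUIREMENTS), RELEASE_DATES, today=date(2025, 1, 1)
    ):
        print(f"{name}=={version} is {age} days old")
```

A list like this gives each stale pin a name to assign an owner to, which is the part the tooling cannot do for you.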

5. Document What You Know. Discover What You Don't.

Before AI can help you move fast, it needs to understand what it's reading — and so do you. Consistent naming, small focused functions, clear module boundaries: these aren't perfectionism, they're how you expose what your system actually does versus what you think it does. All the work above helps with this immensely.

Documentation isn't just output. It's a diagnostic. When you can't document something clearly, that's a signal — either the code is too complex, or your understanding has gaps. Both are problems worth finding before AI does.

I love the old mantra: "If it's too hard for you to write the test — it's not you, or the test. It's the code." The same applies to documentation. If you can't explain a module in two paragraphs, the module is the problem.

What to do this week: Pick one area of your codebase that causes the most confusion in code review. Document it. If you can't explain it clearly, that's the next issue to fix — and the documentation attempt just found it for you.

6. Observability Means Connecting Work to Intent

AI agents that can correlate their output back to the intent behind a change — the ticket, the user need, the architectural goal — can self-correct in ways that purely syntax-aware tools cannot.

Structured logging and error tracking are the baseline. They make your system readable to both humans and AI. But the real unlock is traceability: can you tell, within minutes of a deploy, whether what shipped actually did what it was supposed to do? Can you correlate a work item to its intent — answer why it was delivered that way?

This connection between intent and delivery is where AI moves from "code faster" to "ship better." It starts with making your system's behavior observable and anchoring that observability to the problem being solved, not just the code being shipped.

What to do this week: Audit your logging. Are errors structured and searchable? Can you correlate a deploy to a specific ticket within minutes? If not, that gap is the first place AI self-correction will fail.
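One way to make errors structured and tie them back to a ticket and a deploy is a JSON log formatter built on Python's standard `logging` module. This is a minimal sketch: the field names `ticket`, `deploy_sha`, and `service`, and the example values, are assumptions for illustration, not a standard.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object per line."""

    # Hypothetical context fields linking a log line to a change and its intent.
    CONTEXT_FIELDS = ("ticket", "deploy_sha", "service")

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.CONTEXT_FIELDS:
            # Values passed via `extra=` become attributes on the record.
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name: str, stream=sys.stdout) -> logging.Logger:
    """Create a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("billing")
    log.error(
        "charge failed",
        extra={"ticket": "PROJ-123", "deploy_sha": "abc1234", "service": "billing"},
    )
```

With the ticket and deploy identifier on every line, the question "which change did this error come from?" becomes a search rather than an investigation.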

7. You Cannot AI-Enable a Codebase You Don't Understand Yourself

This is the hardest principle — and the most important.

If there are parts of your system that no current engineer fully understands, AI will confidently generate incorrect code there. It will look right. At some point it will pass review. And you will have an incident without understanding the root cause or how to mitigate it.

The fix: treat it as an archaeology project. Pair a senior engineer with someone newer. Walk through the unknown territory together. Write down what you find — ADRs, data flow diagrams, "here be dragons" comments. Build the tests that prove your understanding.
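"Tests that prove your understanding" are often called characterization tests: they pin down what the code does today, before anyone decides what it should do. A minimal sketch follows, with a made-up `legacy_discount` function standing in for the code nobody fully understands.

```python
import unittest


def legacy_discount(total_cents: int, customer_tier: str) -> int:
    """Stand-in for an inherited function nobody fully understands (hypothetical)."""
    if customer_tier == "gold":
        return total_cents - total_cents // 10
    if customer_tier == "silver" and total_cents > 10_000:
        return total_cents - 500
    return total_cents


class LegacyDiscountCharacterization(unittest.TestCase):
    """Records current behavior, not what anyone thinks it should be."""

    def test_gold_gets_ten_percent_off(self):
        self.assertEqual(legacy_discount(20_000, "gold"), 18_000)

    def test_silver_discount_starts_above_threshold(self):
        # Here be dragons: exactly 10,000 cents gets no discount.
        self.assertEqual(legacy_discount(10_000, "silver"), 10_000)
        self.assertEqual(legacy_discount(10_001, "silver"), 9_501)

    def test_unknown_tier_is_charged_full_price(self):
        self.assertEqual(legacy_discount(5_000, "bronze"), 5_000)
```

Saved to a file, this runs with `python -m unittest`. The boundary case in the second test is the kind of finding worth writing down: whether it is a bug or a business rule is a question for a human, but once it is pinned by a test, neither a person nor an AI agent can change it by accident.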

Then bring in AI to do the same exercise, and correct it where it's wrong by adding new language or context to your prompts. When it's right, build confidence that you can trust it in the specific scenarios you now understand first-hand.

Do this even when it feels slow. The manual work builds the trust and confidence you'll need to delegate to AI later. You can't verify what you don't understand — and neither can it.

This work has a compounding return with or without AI. Pre-AI, it made systems more maintainable and developers happier — and you could still ship. Post-AI, it unlocks everything else on this list and becomes the barrier to shipping if you skip it.

The 30-Day Action Plan

Days 1–5: Assess

Pull test coverage reports for your five most business-critical services
Run a dependency audit on every repository
Ask three engineers: "What's the part of our codebase you'd least like to change?"
Identify your most undocumented architectural decision — the one that always generates questions in onboarding

Days 6–15: Fix the Floor

Enable Renovate or Dependabot on every repo
Assign coverage remediation to your lowest-covered critical service
Use an LLM to help map your services and dependencies; save that as the first ADR
Update your contribution guide with one concrete pattern example

Days 16–30: Build the Runway

Define your paved road for new service creation
Audit logging — structure errors, make them searchable
Create CODEOWNERS for your highest-risk areas
Run a team session: "What does good AI-generated code review look like?" Document the answer

Conclusion: AI Won't Save a Codebase You Haven't Invested In

It's easy to skip this work when there are deadlines to meet. But getting to 95% test coverage is a press release for your teams — with AI or without it. And the teams that do this foundational work are the ones who'll look back in two years and realize this is what 100x productivity actually means: not AI writing code faster, but AI compounding the good engineering practices you built deliberately.

The leaders who see the most value from AI in the next three years are not the ones who deploy it fastest. They are the ones who prepare deliberately, invest in the practices that make AI useful — testing, documentation, standards, observability, traceability — and then deploy AI into an environment it can succeed in.

AI will find out what kind of engineering culture you have. Better to find out yourself first — fix it, be proud of it, and watch your stronger team working on a more resilient codebase do things that weren't possible before.

Engineering leader. DevOps and AI-native development. Writing about what I'm actually seeing in the field.
