"Just Reviewing Code"
_Soon, developers will 'just' be reviewing the work that AI tools do._
I hear this a lot lately, and it's worth unpacking. The idea is that as AI tools get better at writing code, the developer's role shifts from writing to reviewing. That might sound like a step back, like reviewing is somehow less of a craft. But I think that misses what reviewing actually is.
To me, reviewing the underlying wiring of a product is one of the most impressive and important skills a developer can have. And I think the bar isn't getting lower; it's getting higher.
Code is a translation
Here's the thing about software: code is not the product. It's a translation. Someone has an idea, a need, a goal, a problem they want solved. A developer's job is to take that abstract thing and turn it into something that actually works in the real world: reliably, for real people, over time. It is not just about writing code that runs.
Think about what a developer actually holds in their head when they're doing this job well. It's not just the code itself, it's the translation between three different worlds:
- Business: What we are trying to accomplish, and what "good" looks like at this stage of growth.
- Users: How people are actually using the product, including the ways nobody planned for.
- System: How this piece of code sits inside a larger universe, what it depends on, what depends on it.
Product managers see the surface. Leadership sees the strategy. But the developer is the one who can look at a three-line change and know whether it's going to quietly cause problems six months from now. That's a genuinely unique skill.
Not every change is a big deal. But the more complex the system, the more users, the more critical the feature, the more important it is to have someone who can understand the impact and make a call.
So what does reviewing actually take?
At its core, review is about one question: does this align with where we're going, and is it going to hold up over time?
Context. A startup moving fast has different tradeoffs than a company managing critical systems. Code that's totally reasonable in one context is a liability in another. The most secure approach might not be the most user-friendly. Simplicity now might mean inflexibility later. None of these have universal right answers — they depend on your business stage, your users, your risk tolerance. Someone has to make the call.
Reading what isn't there. Some of the most important things in a diff are the things missing from it. The error case that wasn't handled, the edge case nobody thought about, the test that would've caught the bug. The surface might look fine, but the cracks only show if you know the system well enough to look for them.
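As a small illustration of a gap that a diff never shows, here is a minimal sketch. The function names and the shape of `orders` are hypothetical, invented for this example. The first version reads cleanly and passes a happy-path test; the second adds the case that was missing, along with a product decision the reviewer had to make.

```python
def average_order_value(orders):
    """Short, readable, and fine in a diff, as long as there is at least one order."""
    return sum(order["total"] for order in orders) / len(orders)


def average_order_value_reviewed(orders):
    """Same logic, plus the case the diff never mentioned: no orders yet."""
    if not orders:
        # Assumption for this sketch: an account with no orders reports 0.0.
        # Whether that is right (versus None, or an error) is a product call.
        return 0.0
    return sum(order["total"] for order in orders) / len(orders)
```

Nothing in the first version looks wrong on the page. It fails with a `ZeroDivisionError` only when `orders` is empty, which is exactly the situation of a brand-new account.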
Consequences. How is this being monitored? What happens to your dashboards when this ships? What doors has this opened for an attacker? Breaking changes are brutal — if a feature is tightly coupled to everything else, untangling it later becomes expensive. These things hide in the details. If you don't know the ecosystem, you won't know what to ask about it.
As teams grow, you need more people who can make those calls, not fewer. The people closest to the code have to be empowered to think at that level.
The limits of automation
Evals can catch regressions and enforce consistency. LLM-as-a-judge can surface issues faster than a human reading every diff. These tools have real value.
But someone has to decide what the eval is even measuring. What counts as "correct" depends on your business stage, your users, your risk tolerance, your system's history. An eval that made sense six months ago might be measuring the wrong thing today. A judge never calibrated to your context will produce confident, consistent, wrong answers.
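To make that concrete, here is a toy sketch of an eval harness. Everything in it is hypothetical (`EvalCase`, `run_eval`, `toy_system`, and both matchers are made up for illustration, not taken from any real eval library). The point is only that the pass criterion is an input someone chose, and the same system scores very differently depending on that choice.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    expected: str


def run_eval(
    cases: List[EvalCase],
    system: Callable[[str], str],
    is_correct: Callable[[str, str], bool],
) -> float:
    """Return the fraction of cases the system passes under the given criterion."""
    passed = sum(1 for case in cases if is_correct(system(case.prompt), case.expected))
    return passed / len(cases)


def exact_match(output: str, expected: str) -> bool:
    # Strict criterion: any difference in case or whitespace is a failure.
    return output == expected


def lenient_match(output: str, expected: str) -> bool:
    # Lenient criterion: ignore surrounding whitespace and letter case.
    return output.strip().lower() == expected.strip().lower()


def toy_system(prompt: str) -> str:
    # Stand-in for the system under test; always answers with sloppy formatting.
    return " Paris "


cases = [EvalCase(prompt="Capital of France?", expected="paris")]

strict_score = run_eval(cases, toy_system, exact_match)     # 0.0
lenient_score = run_eval(cases, toy_system, lenient_match)  # 1.0
```

Neither score is "the truth". Which criterion is appropriate depends on whether formatting matters to the people using the output, and that is a decision the harness cannot make for you.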
Evals drift. Models change. The product evolves. Someone has to own that process, understand why it's built the way it is, and have the judgment to know when it needs to change. Automation makes review more efficient — but the thing being automated is judgment about whether something is good enough, for this system, for these users, today. That part doesn't automate itself.
Reviewing is deciding
Natural language is ambiguous by design. Code isn't. When everything flows through prompts and summaries, that ambiguity compounds quietly. Reviewing means being able to drop to the level where it disappears.
Now more than ever, building reliable systems for real people is fundamentally an act of decision-making, not just execution.
The best developers aren't just technically sharp. They're the ones who understand the translation: from business goals to user needs to system design to implementation details, and back again. They're the ones who notice when something is drifting out of coherence with where the product is headed. They're the ones who ask the uncomfortable question before the uncomfortable consequence arrives.
Reviewing is not a checklist. Not a gatekeeping ritual. It is a form of ownership over something you understand better than almost anyone else in the room.
