Eliminating Waste in the SDLC
I don’t write much code anymore.
About six months ago, Dario Amodei predicted that within a year, 90% of code would be AI-generated. More recently, Boris Cherny — the creator of Claude Code at Anthropic — said that 100% of his code was already written by AI. I thought those numbers sounded aggressive. Now I’m living them.
My job today looks more like a conductor, a project manager, and a product owner than a software engineer in the traditional sense. I think about the product holistically. I write specifications. I build implementation plans. I spin up agents to do the development. I facilitate the pull request review cycle. I merge, deploy, and monitor. The full lifecycle — but I’m orchestrating it, not executing it by hand.
And the thing that accelerated this shift more than anything else wasn’t a better model. It was giving the AI access to my tools.
The Toyota Lens
There’s a concept from the Toyota Production System called muda — waste. Taiichi Ohno defined seven types: overproduction, waiting, transport, overprocessing, inventory, motion, and defects. The genius wasn’t the taxonomy. It was the process: stand on the factory floor, watch the work happen, identify what doesn’t add value, eliminate it. Then do it again. And again. Each pass reveals the next layer.
Software development has its own waste. We just don’t see it because we’re not standing next to a conveyor belt. We’re tabbing between Jira and the IDE, manually updating ticket statuses, copy-pasting error messages from Sentry into issue descriptions, checking dashboards that an automated system could read faster and more thoroughly than we ever could.
The question isn’t “how do I write code faster?” It’s “what am I still doing by hand that a machine could do better?”
The Breakthrough: Tool Handoff
Here’s the insight that changed everything for me: AI can use tools. The same tools I use.
This sounds obvious when you say it out loud. But the implications are enormous, and most people haven’t internalized them yet.
For months, I was the bottleneck in my own workflow. Not because I was writing code slowly — the AI agents were handling that. I was the bottleneck because I was still the one processing Jira tickets. I was the one triaging my sprint. I was the one updating the roadmap. I was the one checking Sentry for new exceptions, then cross-referencing GitHub to see if someone had already opened a PR for it. I was the one moving tickets between swimlanes, assigning story points, linking parent tickets, writing acceptance criteria descriptions.
All of that is work. But almost none of it requires human judgment — a machine could do every bit of it, and do it more consistently. It’s muda — motion without production.
Then I discovered that MCP servers existed for Jira and GitHub. MCP — Model Context Protocol — is a way for AI agents to interact directly with external services. Not through me copying and pasting. Directly. Claude talks to Jira. Claude talks to GitHub. Claude talks to Sentry. Claude talks to GCloud.
When I made that connection, the waste didn’t just shrink. Entire categories of it vanished overnight.
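For Claude Code, the wiring is a small config file. Here’s a sketch of what a project-level `.mcp.json` might look like — the server package names, env vars, and placeholder tokens are illustrative, so check each server’s own documentation for the current setup:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<token>" }
    },
    "sentry": {
      "command": "npx",
      "args": ["-y", "@sentry/mcp-server"],
      "env": { "SENTRY_AUTH_TOKEN": "<token>" }
    }
  }
}
```

Once registered, the agent can call those services’ tools directly in the middle of a task — no copy-paste hop through a human.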
What Disappeared
Jira became automatic. Claude now manages the board. Every piece of work has a ticket — not because I’m disciplined about creating them, but because the AI creates them as part of its natural workflow. The tickets have proper descriptions, acceptance criteria, story points, and parent links. They move to the right swimlane at the right time. Sprint planning happens conversationally: I describe priorities, Claude structures the sprint, creates the tickets, and links them to the roadmap.
The quality of my project management improved when I stopped doing it myself. That’s not a comfortable thing to admit, but it’s true. I was inconsistent about writing ticket descriptions. I’d forget to update statuses. I’d let story points slide because pointing felt like overhead. The AI doesn’t forget. It doesn’t get lazy at 4pm on a Friday. It does the same thorough job every single time.
Sentry triage became proactive. I built a workflow where Claude checks Sentry for new exceptions, cross-references them against GitHub to see if there’s already a pull request addressing the issue, checks Jira to see if there’s already a ticket, and determines the next action. If the exception is new and undocumented, it creates the Jira ticket, assesses severity, and recommends whether it goes into the current sprint or the backlog. What used to be a 15-minute manual triage session — open Sentry, read the stack trace, search Jira, search GitHub, create a ticket, write up the context — happens in seconds. And it happens more consistently than I ever managed.
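The decision logic behind that workflow is simple enough to write down. A minimal sketch in Ruby — the method and struct names are hypothetical, and a real version would make the Sentry, Jira, and GitHub lookups via MCP rather than take them as booleans:

```ruby
# Sketch of the triage decision described above. Inputs stand in for
# the cross-references a real agent would fetch from Jira and GitHub.
TriageResult = Struct.new(:action, :severity, :destination)

def triage_exception(jira_ticket_exists:, pr_open:, severity:)
  # A fix is already in flight: nothing to create, just watch the PR.
  return TriageResult.new(:monitor_pr, severity, nil) if pr_open
  # Already documented in Jira: no duplicate ticket.
  return TriageResult.new(:already_tracked, severity, nil) if jira_ticket_exists
  # New and undocumented: create a ticket and route it by severity.
  destination = severity == :high ? :current_sprint : :backlog
  TriageResult.new(:create_ticket, severity, destination)
end
```

The point isn’t the code — it’s that the whole 15-minute manual session reduces to a lookup plus a three-branch decision, which is exactly the kind of work an agent should own.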
Infrastructure monitoring became conversational. A /devops command kicks off a sequential audit: GCloud health checks, Sentry exception review, Jira cross-reference, GitHub PR status. By the time I’ve finished my coffee, I have a full-stack health report with new issues already triaged and documented. That’s not a dashboard I’m reading. It’s an agent that did the reading for me and surfaced only what needs my judgment.
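The command itself is nothing exotic — in Claude Code, a custom slash command is just a prompt file in `.claude/commands/`. A sketch of what a `devops.md` might contain (the wording is illustrative, not my actual file):

```markdown
Run a sequential infrastructure audit:

1. Check GCloud: service health, error rates, recent deploys.
2. Pull new Sentry exceptions since the last audit.
3. Cross-reference each exception against Jira tickets and open GitHub PRs.
4. Create Jira tickets for anything new, with severity and a
   sprint-vs-backlog recommendation.
5. Summarize: what changed, what was handled automatically, and what
   still needs my judgment.
```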
Upstream Waste: Specifications
With the operational overhead handled, a different category of waste became visible: building the wrong thing.
I audited eight pull requests — about 250 review comments total. The data was stark:
| Category | % of Review Comments |
|---|---|
| Missing tests | 24% |
| N+1 queries | 14% |
| SQL injection | 12% |
| Everything else | 50% |
Half the review cycle was catching preventable mechanical issues. That’s waste. But the more expensive waste was hiding in the other 50% — comments about misunderstood requirements, missing edge cases, features that worked but didn’t match what was actually needed.
The fix was investing more time at the beginning. Before any implementation starts, the feature gets acceptance criteria in Gherkin format:
```gherkin
Scenario: Admin generates a monthly usage report
  Given I am logged in as an admin
  When I navigate to Reports and select "March 2026"
  Then I see a summary table with active users, new signups, and churn
  And the report is available for download as CSV
```
Each scenario maps directly to a test. If I can’t write the scenario, I don’t understand the feature well enough to build it. And when I hand an agent an unambiguous Gherkin scenario, it builds exactly what I described — not what it guessed I meant.
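That mapping can be made literal. A sketch of the system test this scenario might compile to, assuming a Rails app with Capybara and headless Chrome — the fixture name, paths, and selectors are all hypothetical:

```ruby
# Hypothetical Rails system test for the scenario above.
# Assumes Capybara configured for headless Chrome; names are illustrative.
require "application_system_test_case"

class MonthlyUsageReportTest < ApplicationSystemTestCase
  test "admin generates a monthly usage report" do
    sign_in users(:admin)                   # Given I am logged in as an admin

    visit reports_path                      # When I navigate to Reports
    select "March 2026", from: "Period"     # ...and select "March 2026"

    within "#usage-summary" do              # Then I see a summary table
      assert_text "Active users"
      assert_text "New signups"
      assert_text "Churn"
    end

    assert_selector "a[href$='.csv']"       # And the report downloads as CSV
  end
end
```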
For the mechanical issues, I built a pre-flight checklist derived directly from the PR audit data. Never interpolate values into SQL strings. No database calls inside loops. Write tests alongside implementation. Every rule came from an actual review comment on an actual PR. The checklist lives in the CLAUDE.md context file — every agent reads it before writing a line of code. A pre-commit hook enforces Rubocop compliance. The agents can’t ship code that violates the rules because the rules are in the environment.
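For concreteness, here’s a hedged sketch of what such a checklist section in CLAUDE.md might look like — the rules come straight from the audit above, but the exact wording and examples are illustrative, not my actual file:

```markdown
## Pre-flight checklist (derived from PR review audit)

- Never interpolate values into SQL strings; use parameterized queries
  (e.g. `where("email = ?", email)`, never `where("email = '#{email}'")`).
- No database calls inside loops; batch with `includes`/`preload`
  to avoid N+1 queries.
- Write tests alongside the implementation, not after.
- Run Rubocop before committing; the pre-commit hook rejects violations.
```

Because this lives in the agent’s context, it isn’t advice — it’s environment. Every agent reads it on every run.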
The planning phase got longer. The total cycle got dramatically shorter.
Integration Verification
There’s one more layer of waste that took me multiple painful deployments to identify: false confidence from tests that don’t exercise the full stack.
Unit tests pass. Controller tests pass. The agent reports success. You deploy and the page is broken — a misnamed controller, a Stimulus identifier that doesn’t match, a Turbo frame the view wasn’t expecting. Everything worked in isolation. Nothing worked in the browser.
The fix is cheap: one system test per feature, running in headless Chrome, exercising the happy path end-to-end. It catches in seconds what manual QA catches in hours. Now every feature branch requires it. It’s the cheapest waste elimination I’ve found.
Everything Happens Simultaneously
Here’s where it gets interesting. These aren’t sequential improvements I made over months. They’re all running in parallel, right now, on every feature.
A single agent can review Sentry, GitHub, and Jira simultaneously — check if an exception is already documented, check if there’s a PR for it, and determine the next action. Another agent is implementing a feature from a Gherkin spec with the pre-flight checklist baked into its context. Another is running the DevOps audit. The sprint board updates itself as work progresses.
My role in all of this is the same as Ohno’s on the factory floor: observe the system, identify what doesn’t add value, and remove it. The difference is that every time I identify a category of waste, I don’t just fix it once — I encode the fix into the system so it never comes back. The pre-flight checklist is encoded. The Gherkin requirement is encoded. The Sentry-to-Jira triage workflow is encoded. The infrastructure audit is encoded.
Each elimination is permanent. And each one reveals the next layer of waste that was previously invisible.
What’s Left for the Human
After you strip away the code writing, the ticket management, the triage, the monitoring, the boilerplate, the mechanical review — what’s left?
Product discovery. Deciding what to build and why. Understanding users. Making judgment calls about trade-offs that require knowing the full context of the business, the market, and the technical landscape. Architecture decisions that determine whether the system will scale or collapse under its own weight. And the ongoing discipline of watching the process itself, looking for the next pocket of waste to eliminate.
That’s not less work. It’s higher-leverage work. The work that actually determines whether the product succeeds or fails.
The Compounding Loop
I don’t think this is fundamentally a story about AI. AI is the catalyst, but the method is older than software.
Ohno stood on the Toyota factory floor for decades, watching the same processes, finding the next margin of waste to remove. The tools evolved — from kanban cards to just-in-time manufacturing to automated quality inspection. But the practice was always the same: observe, identify waste, eliminate it, repeat.
What AI changes is the speed of the loop. When you can hand your tools to an AI agent and let it operate them directly — not through you as a bottleneck, but alongside you as a parallel operator — the iteration cycle compresses from weeks to hours. You can run the waste-elimination loop an order of magnitude more often. And each loop reveals the next layer.
Last month, the waste was in the review cycle. I encoded the fix and it disappeared. This month, it’s in deployment verification and release management. Next month it’ll be something I can’t see yet — because I haven’t eliminated the thing that’s currently blocking my view.
The tools will keep getting better. The models will keep getting smarter. But the practice of systematically identifying and eliminating waste — standing on the factory floor, watching the work, refusing to accept motion without value — that’s the durable skill. That’s what compounds.
Every tool you hand to the AI is a category of waste you never have to manage again. Start handing over the tools.