Last week, EY announced the global rollout of enterprise-scale agentic AI across 160,000 audit engagements in more than 150 countries. It's a remarkable achievement. The press release covers multi-agent frameworks, Microsoft partnerships, responsible AI principles, and workforce training programs.
It has nothing to say about what happens after deployment.
I don't say that to criticize EY. They're not alone in this silence. Across every regulated industry (audit, finance, healthcare, legal), organizations are deploying AI systems at a scale that would have been unimaginable five years ago. And almost none of them are asking the question that will define whether those deployments succeed or quietly degrade: what happens when it breaks, and who knows how to drive it when it does?
We got a glimpse of what that looks like in March, when a security firm's autonomous agent breached McKinsey's internal AI platform in two hours. The platform was used by over 40,000 employees for strategy work and client research. The vulnerability had been sitting there for two years, invisible to conventional scanners. The most alarming detail wasn't the breach itself. It was that the instructions controlling how the AI responded to thousands of consultants were stored in the same database. An attacker could have silently altered them. No deployment. No code change. No traditional trace. Nobody would have noticed.
That's not a security failure. That's a design philosophy failure. Nobody thought about the drivers.
They bought the engine. Then they bolted on whatever was lying around.
Stewart Brand's new book Maintenance: Of Everything opens a conversation the technology industry urgently needs to have. Brand has spent decades thinking about how things last, and his argument, applied to AI, lands with unexpected force.
Here's the honest diagnosis of where enterprise AI actually is right now: organizations didn't build anything. They licensed an extraordinary engine from someone else (powerful, capable, genuinely impressive) and bolted it into their existing processes. Then they added tires from one vendor, brakes from another, a gear shift that sort of works. The result moves. Sometimes impressively. But nobody in the organization fully understands the whole system, nobody designed it to be understood, and when something goes wrong there's no manual, no mechanic, and no one who knows the road.
The Rolls-Royce analogy gets invoked a lot in AI conversations, and it's not entirely wrong, but it implies the problem is fragility. The real problem is something different. The Rolls-Royces still running a century later aren't running because they never broke. They're running because they were built for a world where dedicated mechanics, proprietary parts, and manufacturer relationships were assumed. That world doesn't scale to 160,000 audit engagements. And it was never designed for the person who just needs to get to work.
Henry Ford understood something different. The Model T wasn't just cheaper. It was designed to be driven, understood, maintained, and adapted by the people who used it every day. The knowledge escaped the factory. Mechanics happened. Drivers happened. A whole vernacular ecosystem of expertise grew up around it, distributed and improvable by anyone. Not just the few who could afford dedicated support. Anyone who needed to get somewhere.
That's the gap enterprise AI has opened. Organizations bought the engine. They're missing the vehicle anyone can drive.
What I watched happen
I was part of a team at Google Research working on exactly this problem: building systems designed to learn from the people who used them, through structured feedback loops, expert communities, and what we called adaptation via feedback. The premise was bidirectional: models adapt toward usefulness, the people operating them develop genuine understanding of the system, and the loop between the two is where reliability actually lives.
Then the scaling bet took over: enough data and enough compute, the thinking went, and the human teaching signal becomes noise. The teams working on human-oriented adaptation got reorganized. The work didn't disappear, but the conviction that it was necessary did.
Here's what that bet gets wrong in enterprise contexts specifically. The feedback that trains general models is implicit, aggregate, and divorced from real work. What gets captured is clicks, session length, thumbs up or down. What doesn't get captured is the partner who rewrote the output in a meeting, the manager who caught the error three steps downstream, the auditor who knew the answer was wrong but couldn't explain why to the system. That's the expert signal. That's the tacit knowledge that makes the difference between a system that works in a demo and one that works in a real audit of a real company with a real set of problems nobody anticipated.
That signal is evaporating into the air of every enterprise deployment right now, uncaptured, every single day. The scaling bet assumes someone is collecting it. Nobody is.
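To make the missing capture concrete, here is a minimal sketch, in Python, of what a structured channel for that expert signal could look like: a record that ties an expert's rewrite or rejection, and the judgment behind it, back to the work product it corrects. Every name here (ExpertCorrection, CorrectionType, capture) is hypothetical, illustrating one possible shape rather than any vendor's actual schema.

```python
# Hypothetical sketch: capturing expert corrections as structured records,
# in contrast to the implicit signals (clicks, session length, thumbs)
# that general models train on. All names are illustrative.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class CorrectionType(Enum):
    REWRITE = "rewrite"              # partner rewrote the output in a meeting
    DOWNSTREAM_CATCH = "downstream"  # error caught steps after generation
    UNEXPLAINED_REJECT = "reject"    # expert knew it was wrong, couldn't say why


@dataclass
class ExpertCorrection:
    """One unit of expert signal, tied to the work product it corrects."""
    engagement_id: str
    model_output: str
    corrected_output: str | None     # None for a rejection without a rewrite
    correction_type: CorrectionType
    expert_role: str                 # e.g. "audit partner", "engagement manager"
    context_notes: str               # the free-text judgment aggregate metrics lose
    captured_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def capture(correction: ExpertCorrection, store: list[ExpertCorrection]) -> None:
    """Persist the correction so it can feed adaptation and regression suites
    instead of evaporating with the meeting it happened in."""
    store.append(correction)
```

None of this is hard to build. What's hard, and what almost nobody is doing, is putting it in the path of real work.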
I watched the gap open. Capability accelerated. The understanding required to operate, adapt, and trust these systems in real environments did not keep pace. That gap is now visible in every enterprise AI deployment that behaves differently in production than it did in the lab, and nobody quite knows why, and nobody quite knows what to do about it.
The loss you're already taking
The industry is still framing this as future risk. It isn't. It's a present-tense, ongoing loss.
Every AI system running in production without this infrastructure is degrading right now. Not dramatically. Quietly, through the accumulation of failures that aren't caught, aren't classified, aren't routed to anyone who could fix them. The system that worked beautifully in the demo drifts from its environment. The environment changes and the system doesn't. The people who need to drive it learn workarounds instead, because the real controls aren't available to them.
This is the modern car problem. Vehicles got sealed, software-dependent, proprietary. The right-to-repair movement exists because people recognized what was being lost: not just the ability to fix things, but the ability to understand what you own well enough to use it on your own terms. Enterprise AI is making the same mistake at much higher cost and much higher stakes.
There's a counterargument worth taking seriously: move fast, deploy, then tear out what doesn't work and optimize once you know what to keep. It's a philosophy that has worked in software before. But it has a hidden prerequisite: you need enough signal to know what to keep. Within a single development team working on a contained system, that signal is visible. Across 160,000 audit engagements in 150 countries, it isn't. The signal lives in the heads of the professionals using the system every day: the workarounds they've developed, the judgment calls they've made, the edge cases they've quietly routed around. It doesn't flow back to the central team automatically. It evaporates. You can only optimize what you can see. And right now, almost nobody is building the infrastructure to see it.
Complex does not mean only-the-builder-can-operate-it. That's a design choice, not an inevitability.
What the steering wheel looks like
There's a version of "solving this" that's actually just a different form of the same problem. It goes like this: don't worry about the drivers, because the car drives itself. We've built a self-driving system. Just tell it where you want to go.
That sounds reassuring until you think about what it assumes. Self-driving only works on a known road. Audit is not a known road. Every client is different. Every year is different. The financial system changes. Regulations change. Edge cases multiply. The entire value of having expert auditors is that they can navigate complexity and novelty that no fixed system anticipated. An AI that drives itself on a single road isn't a solution to that problem. It's a paternalistic restatement of it, dressed up as progress.
What's needed is not a self-driving car. What's needed is a vehicle that anyone who needs to use it can actually drive, understand, and adapt to wherever they need to go.
We're currently building monitoring and reliability infrastructure for one of the Big Four, a firm that made a different choice. They started by asking: how will we know when this is wrong? How will we fix it when it is? How do we make sure the people who need to drive this system every day understand it well enough to do that?
That's not a slower approach. That's the only approach that produces something durable.
It means monitoring infrastructure that tells you not just whether the system ran, but whether it did the right thing. Triage logic that classifies failures by severity. Repair processes that turn production failures into test cases. Verification before fixes ship. And crucially, knowledge that transfers to the people who operate the system, so that understanding doesn't stay locked in the builders. So that anyone who needs to drive it can.
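As a rough illustration, here is a minimal Python sketch of that loop: a monitor that checks correctness rather than uptime, severity triage, failures promoted to regression tests, and a verification gate a fix must pass before it ships. The names (Failure, monitor, failure_to_test, verify_before_ship) are hypothetical, a sketch of the shape of the loop rather than the infrastructure any particular firm runs.

```python
# Hypothetical sketch of the maintenance loop described above. Names and
# structure are illustrative, not a real system's API.
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Severity(Enum):
    CRITICAL = 1  # wrong answer reached a client deliverable
    MAJOR = 2     # wrong answer caught downstream
    MINOR = 3     # cosmetic or recoverable


@dataclass
class Failure:
    case_id: str
    prompt: str
    expected: str
    actual: str
    severity: Severity


def monitor(prompt: str, expected: str, actual: str,
            check: Callable[[str, str], bool]) -> Failure | None:
    """Did the system do the right thing, not merely run?"""
    if check(expected, actual):
        return None
    # Real triage logic would classify severity from context; MAJOR is a placeholder.
    return Failure(case_id=prompt[:32], prompt=prompt,
                   expected=expected, actual=actual, severity=Severity.MAJOR)


def failure_to_test(failure: Failure) -> Callable[[Callable[[str], str]], bool]:
    """Repair process: every production failure becomes a regression test."""
    def regression_test(system: Callable[[str], str]) -> bool:
        return system(failure.prompt) == failure.expected
    return regression_test


def verify_before_ship(candidate: Callable[[str], str],
                       tests: list[Callable[[Callable[[str], str]], bool]]) -> bool:
    """Verification: a fix ships only if it passes every accumulated test."""
    return all(test(candidate) for test in tests)
```

The point of the sketch is the shape, not the code: every failure leaves a permanent artifact, and every future fix has to clear the accumulated record before it reaches production.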
This is what Brand means by maintenance as a philosophy rather than a task. Buildings that last aren't the ones that never need repair. They're the ones designed so that repair is possible, understandable, and something their occupants can do themselves.
The AI systems that will still be running well in ten years won't be the most sophisticated at deployment. They'll be the ones whose operators learned to understand them, adapt them, and make them their own. Not because they had dedicated support on retainer. Because the system was designed so that anyone who needed to use it could.
They bought the engine. Now let's build something they can drive.
Dr. Marisa Ferrara Boston is Managing Partner of Reins AI, which builds evaluation and reliability infrastructure for enterprise AI systems. She previously contributed to AI research at Google and served as Lead AI and Automation Architect at KPMG.