The High Cost of Running Lean

[Image: A towering stack of paperwork on a cluttered desk, symbolizing how “running lean” quietly shifts costs into overload, admin drag, and hidden operational risk.]

A manufacturing VP sits in a board meeting and hears what she’s been told for months: “We’re running lean. Utilization is at 96%. Everyone is busy.” Six months later, she’s frustrated. The product roadmap is six weeks behind. Quality issues are rising. Her best people are burning out, and she can’t figure out why. The dashboard says “Green.” Her gut says something is broken.

Most executives assume that if everyone is working hard, something productive is happening. You see packed calendars, Slack notifications flying past, tickets closing, campaigns launching. You see Jira boards full of “in-progress” work. The dashboard says “Green” on Utilization. Everyone is at 95%+ capacity.

And yet, the product roadmap slips. The campaign that should have launched three weeks ago is still in review. The decision “aligned on” last month is somehow being relitigated. The code that shipped last week needs a hot-fix. Something feels stuck, even though the activity data suggests everyone is moving at full speed.

The problem isn’t effort. You have optimized the system for busyness, not for speed. You have built a highway so full of cars that traffic has stopped moving.

The Math of Gridlock

This isn’t opinion. It’s queueing theory, backed by decades of operations research.

At 70% capacity utilization, a system functions. Work flows through. You can absorb a new request without breaking the pipeline. Someone gets sick, and the team compensates. A bug surfaces mid-sprint, and you have space to handle it without cascading delays.

At 90% utilization, everything becomes fragile [1]. Every task now has to wait. Variability that was once absorbed by slack becomes a bottleneck. A two-hour job that depends on someone else no longer takes two hours to complete; it takes two weeks, because it sits in a queue for five days waiting for that person to have two consecutive free hours of focus.

At 99% utilization—which is where most knowledge work organizations sit—you are mathematically guaranteed gridlock [1]. A task that should take two hours spends 95% of its life waiting. The work is being “done.” The value is not moving.

Kingman’s Formula from queueing theory makes this precise. As utilization climbs toward 100%, wait times don’t grow linearly; they explode [1]. Squeezing out the last few points of “idle” capacity doesn’t slow you down by a few percent: pushing utilization from 90% to 95% roughly doubles your queue times. This is not a management problem. It’s a math problem.
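For readers who want the math itself, this is the standard textbook form of Kingman’s approximation for a single-server queue (the general result, not a figure pulled from the cited article):

```latex
% Kingman's approximation: expected wait time in queue for a G/G/1 system
\mathbb{E}[W_q] \;\approx\;
\underbrace{\frac{\rho}{1-\rho}}_{\text{utilization term}}
\cdot
\underbrace{\frac{C_a^{2} + C_s^{2}}{2}}_{\text{variability term}}
\cdot
\underbrace{\tau}_{\text{avg. task time}}
```

Here ρ is utilization, C_a and C_s capture how variable the incoming requests and the task durations are, and τ is the average task time. The ρ/(1−ρ) term is the one that hurts: it is roughly 2.3 at 70% utilization, 9 at 90%, and 99 at 99%. The same variability that a 70%-loaded team absorbs quietly buries a 99%-loaded one.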

Google’s DevOps Research and Assessment (DORA) team measured this in software engineering. They tracked what separates “low performers” from “elite performers” across thousands of teams [2]. Elite performers—those deploying multiple times per day with lead times under one hour—operate with deliberate spare capacity [2]. They have time for code review, architectural thinking, and responding to emergencies. Low performers, who deploy monthly with lead times of weeks, are running so hot that they have no time for anything but the next ticket [2].

The difference in deployment lead time between the two groups is measured in weeks versus hours. The difference in team utilization is only 15–20 percentage points.

The False Efficiency

Most organizations measure Utilization: “Is everyone busy?” This logic comes from manufacturing, where an idle machine is lost money. A CNC mill sitting idle generates zero revenue, so you want it running 24/7.

Knowledge work is not manufacturing. A developer sitting idle for two hours is not a loss. If that idle time means she is available the moment the code review is done, so the next task doesn’t queue for a week, then those two hours just collapsed the cycle time of the entire pipeline [3].

The metric you should measure is Flow Efficiency: “What percentage of the work’s lifetime is it actually being worked on, versus waiting?” In most organizations, this number is brutal [3]. A task that spends two weeks “in progress” is probably being actively worked on for 2–4 hours of that time. The other 330 hours are spent waiting for approvals, feedback, or someone’s attention to become available.
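To make the definition concrete, here is a minimal sketch with invented timestamps (not data from the cited study): flow efficiency is just the hours of active work divided by the total elapsed lead time.

```python
from datetime import datetime

# Hypothetical ticket: two weeks of calendar time, about 4 hours of real work.
created = datetime(2025, 3, 3, 9, 0)     # request entered the system
delivered = datetime(2025, 3, 17, 9, 0)  # value reached the customer
active_hours = 4                         # time someone actually worked on it

lead_time_hours = (delivered - created).total_seconds() / 3600
flow_efficiency = active_hours / lead_time_hours

print(f"Lead time: {lead_time_hours:.0f} hours")   # 336 hours
print(f"Flow efficiency: {flow_efficiency:.1%}")   # ~1.2%
```

Run the same arithmetic on your own tickets; single-digit percentages are consistent with the “brutal” numbers described above.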

Research from Asana on coordination overhead found that knowledge workers now spend 60% of their time on “work about work”—meetings, status updates, task-switching, approvals—rather than the work itself [4]. At 95%+ utilization, that coordination overhead becomes the bottleneck. You can’t absorb a new request because there’s nowhere to put it. You can’t take time for deep work because your calendar is packed. You can’t think strategically because your day is consumed by keeping current work from falling apart.

At 100% utilization, you are not being “efficient.” You are just making the work invisible while the system decays.

Three Costs of Running Hot

Context Switching and the “Flow Tax.” A developer assigned to five projects because “we want to keep her billable” is not five times as productive. She pays a “switch tax” on every transition—refocusing, re-reading code from days ago, remembering where she left off in three different architectural decisions.

Asana’s 2025 Anatomy of Work research finds that knowledge workers now switch contexts 20–30 times per day, with the average worker spending 60% of their time on coordination rather than execution [4]. Each switch costs 15–25 minutes of re-orientation. A developer juggling five concurrent projects doesn’t complete them in parallel; she just pays the switch tax five times over.
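Taking the cited ranges at face value, the daily cost is easy to estimate. This is a back-of-the-envelope sketch, not Asana’s own calculation, and it deliberately uses the low end of both ranges:

```python
# Back-of-the-envelope estimate of the daily switch tax,
# using the low end of the figures cited above.
switches_per_day = 20         # low end of the 20-30 range
reorientation_minutes = 15    # low end of the 15-25 minute range

lost_hours = switches_per_day * reorientation_minutes / 60
print(f"Re-orientation time lost per day: {lost_hours:.1f} hours")  # 5.0 hours
```

Even if real switches overlap and the true cost is lower, the arithmetic shows why so little of the day is left for deep work.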

No Time to Think—No Innovation. Innovation requires slack. It requires an afternoon to understand why a system keeps breaking instead of just patching it again. It requires time to read the new documentation, to experiment with a new approach, to sit with a problem long enough to see the root cause instead of the symptom.

Research on slack time and innovation published in MIT Sloan Management Review found a causal link: when employees had periods of lower utilization, they generated significantly more and higher-quality ideas [5]. When utilization was maximized, execution continued. Breakthrough thinking stopped. A company running at 95% utilization is optimizing for output while systematically erasing the conditions for innovation.

Technical Debt Compounds Invisibly. When there is no time to fix root causes, you patch symptoms. You write band-aid code. You skip the refactoring. You defer the security update. At 100% utilization, this is rational—there is literally no time to do it right. So the debt accumulates, unseen.

Two years later, you find that half your cycle time is being spent working around the “quick fixes” you made when you were too busy to do it right. A 2-hour task now takes 4 hours, not because the work got harder, but because you’re constantly navigating around the scaffolding of prior shortcuts. That’s technical debt. It’s invisible in the moment. It compounds over time. And it’s entirely preventable.

The High Performers Run Cooler

This is where GitLab and similar high-velocity organizations diverge from the pack.

GitLab operates as a fully remote, asynchronous-first company with 1,600+ employees distributed across 65 countries [6]. On paper, this should be chaos. In practice, GitLab deploys hundreds of times per day and maintains lead times measured in hours, not weeks [6]. How? Not because they work harder. Because they explicitly designed for slack.

GitLab’s handbook is their operating system. Decision-making is documented. Work is structured to move forward asynchronously, not to fill meetings [6]. This requires spare capacity—time for people to read the documentation, think through their decision, and comment asynchronously. It requires that not everyone’s calendar is fully booked.

Compare this to a “high-utilization” organization where every decision requires a meeting because the context is only in people’s heads. Where async communication fails because no one has time to write clearly or read thoroughly. Where work queues up at every handoff because the next person is booked solid.

Google’s Project Aristotle research identified a critical finding: psychological safety—the belief that you can take interpersonal risks without punishment—was the single strongest predictor of team effectiveness [7]. But psychological safety cannot exist in a 100%-utilized system. When people are drowning in work, they prioritize survival over speaking up. They stop flagging concerns. They stop suggesting better approaches. They stop taking risks.

Teams with higher psychological safety (which correlates with lower utilization and more spare time) reported 43% higher performance variance, 19% higher productivity, and 31% more innovation compared to teams running hot [7].

The Green Dashboard Lie

Your utilization dashboard is green because it measures activity, not outcome.

  • Activity: “The team closed 200 tickets this week.” (Green!)
    Reality: “Lead time for a feature is 35 days.” (Red.)

  • Activity: “Utilization is 97%.” (Green!)
    Reality: “Flow Efficiency is 9%.” (Red. Work exists for 25 days but is actively worked on for 2–3 of them.)

  • Activity: “Everyone is busy.” (Green!)
    Reality: “Three people have left in the last six months. Burnout is rising. Two major projects are late.” (Red.)

The dashboard tells you the inputs are being consumed. It doesn’t tell you whether the system is delivering value. In fact, in a fully loaded system, high utilization is often inversely correlated with speed: the most heavily utilized teams are frequently also the slowest.

This is why a manufacturing company can “run lean” and still miss deadlines. Why a software team can ship “constantly” but never finish the big things. Why a marketing organization can produce more output while fewer campaigns actually move the needle.

You are measuring the wrong thing.

What Now

If you suspect your organization is caught in the Utilization Trap, the fix requires counter-intuitive leadership. You have to be willing to see “idle” capacity and not panic. You have to trust the math instead of the activity.

Cap utilization at 80% explicitly. Make it a formal policy. Plan capacity at 80%. The remaining 20% is not “time off”—it is your buffer for variance, your capacity for urgent requests, and your space for thinking, learning, and innovation. If you are consistently running above 80%, you are not lean; you are brittle. Measure it monthly by team. If a team is at 95%, do not assign new work until utilization drops.
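What the monthly check might look like in practice, assuming you can export planned and available hours per team (the team names and numbers below are invented):

```python
# Hypothetical monthly capacity check against an explicit 80% utilization cap.
UTILIZATION_CAP = 0.80

teams = {
    # team: (planned_hours, available_hours)
    "Platform": (520, 600),
    "Payments": (590, 620),
    "Mobile":   (430, 580),
}

for team, (planned, available) in teams.items():
    utilization = planned / available
    status = "OK" if utilization <= UTILIZATION_CAP else "OVER CAP - no new work"
    print(f"{team:<10} {utilization:5.0%}  {status}")
```

The point is not the tooling; it is that the cap is written down, checked on a schedule, and enforced by refusing new work.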

Measure lead time and cycle time, not just velocity. Velocity tells you how many items closed. Lead time tells you how long it took from start to finish. If velocity is rising but lead time is lengthening, you are churning—completing small work while large value stalls in queues. Make lead time the primary metric for health. When a feature request sits in a queue for two weeks, that is a signal that something in the system is broken.
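If your tracker exports created, started, and finished timestamps, both metrics fall out directly. A minimal sketch with invented timestamps, not tied to any particular tool:

```python
from datetime import datetime

# Hypothetical feature request exported from a ticketing system.
created  = datetime(2025, 4, 1, 10, 0)   # request entered the backlog
started  = datetime(2025, 4, 24, 9, 0)   # someone began working on it
finished = datetime(2025, 5, 6, 16, 0)   # delivered to the customer

lead_time = finished - created    # what the customer experiences
cycle_time = finished - started   # time in active development

print(f"Lead time:  {lead_time.days} days")   # 35 days
print(f"Cycle time: {cycle_time.days} days")  # 12 days
```

In this example the team “worked fast” for 12 days, but the customer waited 35; the other 23 days were spent sitting in a queue.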

Limit work in progress ruthlessly. Stop starting new things. The fastest way to increase speed is to reduce the number of active projects. When an organization is working on seven concurrent strategic initiatives, each one is competing with six others for attention and mental space. Limit WIP to a hard number—perhaps 3 concurrent initiatives per function. Force the organization to finish work before starting new work. DORA research shows this single change can reduce lead time by 50% or more [2].
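The reason this works is Little’s Law, a standard result from the same queueing theory cited above (a general relationship, not a figure from the DORA report): average lead time equals work in progress divided by throughput, so cutting WIP with throughput held steady cuts lead time in direct proportion.

```latex
\text{Average lead time} \;=\; \frac{\text{Work in progress (WIP)}}{\text{Throughput}}
\qquad\text{e.g.}\qquad
\frac{21\ \text{items}}{3\ \text{items/week}} = 7\ \text{weeks}
\;\longrightarrow\;
\frac{9\ \text{items}}{3\ \text{items/week}} = 3\ \text{weeks}
```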

Shift the diagnostic questions in your business reviews. Instead of “Are we at capacity?”, ask “What is waiting? Where is the work stuck? Who is it blocked on? What is the longest queue?” The bottleneck is almost never a lack of effort. It is almost always a queue created by a system running too hot.

Measure and protect psychological safety. Ask your teams: “Do you feel safe to speak up about problems? To suggest a different approach? To say no?” Teams with higher psychological safety recover faster from failure, innovate more, and actually move faster [7]. You cannot buy this. You can only build it by creating conditions where people have space to think, time to speak, and safety to be wrong.

The job of a leader is not to keep people busy. It is to move value. Sometimes, the fastest way to get there is to slow down, clear the highway, and let the traffic actually move.


Sources

[1] Kingman’s Equation Explained – Manzati

[2] DORA’s Software Delivery Metrics: The Four Keys – DevOps Research and Assessment, Google Cloud

[3] Resource Efficiency vs. Flow Efficiency: Balancing Between the Two – Businessmap

[4] The Anatomy of Work: Global Index 2025 – Asana

[5] Slack Time and Innovation – MIT Sloan Management Review

[6] How to Embrace Asynchronous Communication for Remote Work – GitLab Handbook

[7] Project Aristotle: Google’s Data-Driven Insights on High-Performing Teams – Aristotle Performance Research
