Constraints and Bottlenecks

Bottlenecks limit the overall performance of the system and determine where improvements actually matter.

technology, organization · 3 min read

Why it matters

Use this concept to explain observable behavior structurally rather than merely naming it.

Next step

Next, check which archetype or diagnostic method makes the pattern visible in the concrete system.

Definition

Constraints and bottlenecks are the limiting factors in a system. Every system has at least one bottleneck that caps its maximum throughput or performance. According to Goldratt's Theory of Constraints, optimizing anywhere other than the bottleneck is largely an illusion: it does not increase total system throughput and often only builds up queues and work in progress.

System Mechanism

The flow of work or data through a system is governed by its slowest station, the bottleneck. If you speed up the stations before it, work piles up in front of the bottleneck. If you speed up the stations after it, they sit idle. Improvements therefore create system-level impact only when they target the current bottleneck directly.
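The mechanism can be sketched in a few lines of Python. This is a minimal model, assuming a strictly serial chain in steady state; the station names and hourly capacities are hypothetical and chosen only for illustration.

```python
def pipeline_throughput(rates):
    """Steady-state throughput of a serial chain: the rate of its slowest station."""
    return min(rates.values())


def find_bottleneck(rates):
    """Name of the station with the lowest capacity."""
    return min(rates, key=rates.get)


# Hypothetical stations, capacity in items per hour (assumed values).
rates = {"build": 30, "review": 12, "deploy": 20}

# Speeding up a station before the bottleneck changes nothing downstream.
faster_upstream = {**rates, "build": 60}

# Speeding up the bottleneck itself raises throughput of the whole chain.
faster_bottleneck = {**rates, "review": 18}

print(find_bottleneck(rates), pipeline_throughput(rates))
print(pipeline_throughput(faster_upstream))
print(pipeline_throughput(faster_bottleneck))
```

Doubling the capacity of `build` leaves system throughput unchanged, while a smaller increase at `review` lifts it for the whole chain.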

Architecture Example

A request through an API landscape takes 500 ms end to end. Developers optimize the feature-rich Service A from 100 ms down to 10 ms, a 90% local gain. Yet total latency only falls from 500 ms to 410 ms, an improvement of 18%. Why? The true architectural bottleneck is an outdated authentication service that Service A must still wait for, and it still takes 400 ms. The investment went into the wrong place, so the overall architecture gained little.
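The arithmetic behind this example can be checked with a short sketch. It assumes the two services run strictly one after the other and that nothing else contributes to the 500 ms; the halved authentication time is a hypothetical comparison case, not a figure from the example.

```python
def total_latency_ms(stages):
    """End-to-end latency of sequential stages is the sum of their latencies."""
    return sum(stages.values())


def relative_gain(old, new):
    """Fraction by which a value was reduced."""
    return (old - new) / old


before = {"auth_service": 400, "service_a": 100}

# What the developers did: optimize Service A from 100 ms to 10 ms.
after_service_a = {**before, "service_a": 10}

# Hypothetical alternative: halve the authentication service instead.
after_auth = {**before, "auth_service": 200}

local_gain = relative_gain(before["service_a"], after_service_a["service_a"])
system_gain = relative_gain(total_latency_ms(before), total_latency_ms(after_service_a))
auth_gain = relative_gain(total_latency_ms(before), total_latency_ms(after_auth))

print(f"Service A local gain: {local_gain:.0%}")
print(f"System gain from optimizing Service A: {system_gain:.0%}")
print(f"System gain from halving auth: {auth_gain:.0%}")
```

Under these assumptions, a 90% local gain on Service A yields an 18% system gain, whereas a 50% gain on the authentication service would yield 40%.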

Organizational Example

A company hires ten new frontend developers to accelerate feature delivery. But there is still only one DevOps engineer who can deploy to production, and that role is the bottleneck. Development speed upstream of the bottleneck rises sharply, which creates huge backlogs in Git and untested integration branches. Delivery to customers remains exactly as slow as before, but everyone is more stressed.
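A small simulation illustrates the effect. The weekly numbers are assumptions made up for this sketch: finished changes arrive at a fixed rate, and the single deployer can ship at most four per week.

```python
def simulate(weeks, arrivals_per_week, deploy_capacity_per_week):
    """Return (total delivered, backlog at the end of each week)."""
    backlog = 0
    delivered = 0
    backlog_history = []
    for _ in range(weeks):
        backlog += arrivals_per_week                       # work finished upstream
        shipped = min(backlog, deploy_capacity_per_week)   # limited by the deployer
        backlog -= shipped
        delivered += shipped
        backlog_history.append(backlog)
    return delivered, backlog_history


# Assumed values: 5 changes per week before hiring, 15 after, deploy capacity 4.
delivered_before, backlog_before = simulate(8, 5, 4)
delivered_after, backlog_after = simulate(8, 15, 4)

print(delivered_before, backlog_before[-1])
print(delivered_after, backlog_after[-1])
```

In both runs the same number of changes reaches customers. The only thing the extra developers change in this model is the size of the queue in front of the deployer.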

Diagnostic Questions

1. Where is work piling up in the system, whether as tickets, pull requests, or queued data?

2. Where are stations or people permanently overloaded while dependent stations frequently wait for input?

3. Are we really optimizing the primary bottleneck, or are we just making a convenient local component faster?

Why This Concept Helps in Architecture

Once a bottleneck has been identified, such as a slow legacy component, the entire architecture and organization should be aligned around not overloading it. Only after that subordination works should you expand the bottleneck, for example through serious refactoring or caching. Once one bottleneck is removed, another will appear elsewhere in the system.

How to Distinguish It from Similar Topics

Unlike *feedback loops*, which describe cyclical system behavior, bottleneck theory focuses on flow and throughput limits in linear or weakly branched chains of work.

How to Use the Concept in Practice

Use Goldratt's five focusing steps: find the bottleneck, exploit it, subordinate everything else to its pace, elevate it, and once it has moved, start over at step one. That sequence keeps the system focused on throughput instead of local heroics.
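The iterative character of the five steps can be sketched as a loop. This simplified model covers only identify, elevate, and repeat; exploiting and subordinating are organizational measures that are not modeled here. The function name `focus_loop`, the stations, and all capacities are hypothetical.

```python
def focus_loop(capacity, target, step):
    """Repeatedly elevate whichever station currently limits throughput."""
    if step <= 0:
        raise ValueError("step must be positive")
    capacity = dict(capacity)  # work on a copy, leave the input untouched
    elevated = []
    while min(capacity.values()) < target:
        bottleneck = min(capacity, key=capacity.get)  # step 1: identify
        capacity[bottleneck] += step                  # step 4: elevate
        elevated.append(bottleneck)                   # step 5: start over
    return capacity, elevated


# Assumed capacities in items per week.
start = {"dev": 20, "test": 10, "deploy": 6}
final, order = focus_loop(start, target=12, step=4)

print(order)
print(final)
```

With these assumed numbers the constraint moves from `deploy` to `test` and back to `deploy`, which is why the sequence restarts at step one after every improvement.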

First Implementation Steps

Use flow metrics from Kanban or continuous delivery, such as lead time and cycle time, to make bottlenecks unambiguously visible. Do not let local resource utilization fool you: a busy station is not necessarily the constraint, and heroic effort at the wrong point harms the overall flow.
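As a starting point, both metrics can be computed from ticket timestamps. The sketch below assumes one common definition (lead time from creation to completion, cycle time from start of work to completion); teams define these terms differently, and the ticket data is invented for illustration.

```python
from datetime import datetime
from statistics import mean


def days_between(start, end):
    """Whole days between two ISO dates."""
    return (datetime.fromisoformat(end) - datetime.fromisoformat(start)).days


# Hypothetical tickets with ISO dates (assumed field names).
tickets = [
    {"created": "2024-03-01", "started": "2024-03-04", "done": "2024-03-06"},
    {"created": "2024-03-02", "started": "2024-03-08", "done": "2024-03-11"},
    {"created": "2024-03-03", "started": "2024-03-12", "done": "2024-03-13"},
]

lead_times = [days_between(t["created"], t["done"]) for t in tickets]
cycle_times = [days_between(t["started"], t["done"]) for t in tickets]

avg_lead = mean(lead_times)
avg_cycle = mean(cycle_times)
avg_wait = avg_lead - avg_cycle  # time spent queued before work begins

print(f"lead {avg_lead} d, cycle {avg_cycle} d, waiting {avg_wait} d")
```

A large gap between lead time and cycle time indicates that items spend most of their time waiting in a queue, which is a useful first hint for where to look for the bottleneck.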