Constraints and Bottlenecks
Bottlenecks limit the overall performance of the system and determine where improvements actually matter.
Why it matters
Use this concept to explain observable behavior structurally rather than merely naming it.
Next step
Next, check which archetype or diagnostic method makes the pattern visible in the concrete system.

Definition
Constraints and bottlenecks are the limiting factors in a system. Every system has at least one bottleneck that determines its maximum throughput or performance. The Theory of Constraints says that every optimization performed anywhere other than the bottleneck is largely an illusion. It does not increase total system throughput and often only builds up queues and work in progress.
System Mechanism
The flow of work or data through a system is governed by its slowest station, the bottleneck. If you speed up the stations before it, work piles up in front of the bottleneck. If you speed up the stations after it, they sit idle. Improvements therefore create system-level impact only when they target the current bottleneck directly.
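The mechanism can be sketched in a few lines. This is a minimal model, assuming a strictly serial pipeline in steady state; the station names and rates are hypothetical and only illustrate that throughput follows the slowest station.

```python
def pipeline_throughput(rates):
    """Steady-state throughput of a serial pipeline: the rate of its slowest station."""
    return min(rates.values())


def bottleneck(rates):
    """Name of the station with the lowest rate."""
    return min(rates, key=rates.get)


# Hypothetical stations and rates (items per hour); not taken from the text.
baseline = {"intake": 12, "processing": 5, "review": 9}

scenarios = {
    "baseline": baseline,
    "faster upstream": {**baseline, "intake": 24},
    "faster downstream": {**baseline, "review": 18},
    "faster bottleneck": {**baseline, "processing": 8},
}

for name, rates in scenarios.items():
    print(f"{name}: {pipeline_throughput(rates)} items/h, limited by {bottleneck(rates)}")
```

Only the last scenario changes the result: doubling intake or review leaves throughput at 5 items per hour, while raising the processing rate lifts it to 8.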
Architecture Example
An API landscape has an end-to-end latency of 500 ms per request. Developers optimize feature-rich Service A from 100 ms down to 10 ms, a 90% local gain. Yet total latency only falls from 500 ms to 410 ms, about 18%. Why? The true architectural bottleneck is an outdated authentication service that Service A must still wait for, and it still takes 400 ms. The investment was made in the wrong place, so the overall architecture gained comparatively little.
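The arithmetic behind this example can be checked with a short sketch. It assumes the two services run strictly one after the other and that nothing else contributes to the 500 ms; the dictionary keys are illustrative names, and the final scenario (halving the authentication latency) is a hypothetical comparison that does not come from the example itself.

```python
def total_latency_ms(stages):
    """Latency of a request whose stages run one after another."""
    return sum(stages.values())


before = {"service_a": 100, "auth_service": 400}
after = {"service_a": 10, "auth_service": 400}

# Gain seen by the team that optimized Service A.
local_gain = 1 - after["service_a"] / before["service_a"]
# Gain seen by the caller of the whole request path.
total_gain = 1 - total_latency_ms(after) / total_latency_ms(before)

# Hypothetical alternative: leave Service A alone, halve the auth latency.
auth_halved = {"service_a": 100, "auth_service": 200}
alternative_gain = 1 - total_latency_ms(auth_halved) / total_latency_ms(before)

print(f"local gain on Service A: {local_gain:.0%}")
print(f"end-to-end gain:         {total_gain:.0%}")
print(f"gain from halving auth:  {alternative_gain:.0%}")
```

Under these assumptions, a 90% improvement in Service A yields an 18% end-to-end improvement, whereas a 50% improvement at the bottleneck would yield 40%.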
Organizational Example
A company hires ten new frontend developers to accelerate feature delivery. But there is still only one DevOps engineer who can deploy to production, and that role is the bottleneck. The result is that development speed before the bottleneck rises sharply, creating huge backlogs in Git and untested integration branches. Delivery to customers remains exactly as slow as before, but everyone becomes more stressed.
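A small simulation makes the backlog effect concrete. It is a sketch under stated assumptions: deployment is modeled as a single queue with a fixed weekly capacity, and all numbers (5 deployments per week, 5 versus 15 finished features per week, 4 weeks) are hypothetical.

```python
def simulate_backlog(weeks, finished_per_week, deployed_per_week):
    """Return (backlog at the end of each week, total features delivered)."""
    backlog = 0
    delivered = 0
    history = []
    for _ in range(weeks):
        backlog += finished_per_week                  # developers finish work
        shipped = min(backlog, deployed_per_week)     # deployment capacity caps delivery
        backlog -= shipped
        delivered += shipped
        history.append(backlog)
    return history, delivered


# Hypothetical figures: deployment capacity stays at 5 features per week.
before_hiring = simulate_backlog(weeks=4, finished_per_week=5, deployed_per_week=5)
after_hiring = simulate_backlog(weeks=4, finished_per_week=15, deployed_per_week=5)

print("before hiring:", before_hiring)
print("after hiring: ", after_hiring)
```

In both runs 20 features reach customers. The only difference after hiring is the backlog in front of deployment, which grows by 10 features every week.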
Diagnostic Questions
1. Where is work piling up in the system, whether as tickets, pull requests, or queued data?
2. Where are stations or people permanently overloaded while dependent stations frequently wait for input?
3. Are we really optimizing the primary bottleneck, or are we just making a convenient local component faster?
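The first question can be approached by counting work in progress per stage. The following is a minimal sketch, assuming work items can be exported as records with a stage field; the field name, stage names, and sample data are all hypothetical.

```python
from collections import Counter


def wip_by_stage(items):
    """Count how many work items currently sit in each stage."""
    return Counter(item["stage"] for item in items)


def largest_queue(items):
    """Return (stage, count) for the stage holding the most items."""
    return wip_by_stage(items).most_common(1)[0]


# Hypothetical snapshot of a board or queue export.
items = [
    {"id": 1, "stage": "in_development"},
    {"id": 2, "stage": "in_review"},
    {"id": 3, "stage": "awaiting_deploy"},
    {"id": 4, "stage": "awaiting_deploy"},
    {"id": 5, "stage": "awaiting_deploy"},
    {"id": 6, "stage": "in_development"},
    {"id": 7, "stage": "awaiting_deploy"},
]

print(wip_by_stage(items))
print(largest_queue(items))
```

A large waiting queue usually points at the station that consumes from it, here the deployment step, rather than at the station that filled it. A single snapshot is only a hint; a queue that keeps growing across several snapshots is the stronger signal.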
Why This Concept Helps in Architecture
Once a bottleneck has been identified, such as a slow legacy component, the entire architecture and organization should be aligned around not overloading it. Only after that subordination works should you expand the bottleneck, for example through serious refactoring or caching. Once one bottleneck is removed, another will appear elsewhere in the system.
How to Distinguish It from Similar Topics
Unlike *feedback loops*, which describe cyclical system behavior, bottleneck theory focuses on flow and throughput limits in linear or weakly branched chains of work.
How to Use the Concept in Practice
Use Goldratt's five focusing steps: find the bottleneck, exploit it, subordinate everything else to its pace, elevate it, and once it has moved, start over at step one. That sequence keeps the system focused on throughput instead of local heroics.
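The iterative character of the steps can be sketched as a loop. This is a deliberately simplified illustration, not Goldratt's method in full: capacities are hypothetical numbers, "elevate" is reduced to adding a fixed amount of capacity, and the "exploit" step is not modeled at all.

```python
def focusing_loop(capacity, upgrade, target):
    """Identify and elevate the current constraint until the target throughput is reached.

    Returns a log of (constraint, release_rate) pairs, one per iteration.
    """
    capacity = dict(capacity)  # work on a copy, leave the caller's data untouched
    log = []
    while min(capacity.values()) < target:
        constraint = min(capacity, key=capacity.get)  # step 1: find the bottleneck
        release_rate = capacity[constraint]           # step 3: pace intake to the bottleneck
        capacity[constraint] += upgrade               # step 4: elevate the bottleneck
        log.append((constraint, release_rate))        # step 5: start over
    return log


# Hypothetical capacities in items per day.
stations = {"build": 6, "test": 4, "deploy": 5}
steps = focusing_loop(stations, upgrade=2, target=6)
print(steps)
```

In this run the constraint is first the test station and then, once testing has been elevated, the deploy station. The loop never touches the build station because it never limits the flow.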
First Implementation Steps
Use flow metrics from Kanban or continuous delivery, such as lead time and cycle time, to make bottlenecks visible in an unambiguous way. Do not let local resource utilization fool you. Heroic effort at the wrong point harms the overall flow.
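As a starting point, both metrics can be computed from three timestamps per ticket. The sketch below assumes one common convention, which varies between teams: lead time runs from ticket creation to completion, cycle time from the start of work to completion. The field names and dates are hypothetical.

```python
from datetime import datetime
from statistics import median


def days_between(start, end):
    """Elapsed time between two datetimes, in days."""
    return (end - start).total_seconds() / 86400


# Hypothetical ticket history.
tickets = [
    {"created": datetime(2024, 1, 1), "started": datetime(2024, 1, 8), "done": datetime(2024, 1, 10)},
    {"created": datetime(2024, 1, 2), "started": datetime(2024, 1, 9), "done": datetime(2024, 1, 12)},
    {"created": datetime(2024, 1, 3), "started": datetime(2024, 1, 12), "done": datetime(2024, 1, 14)},
]

lead_times = [days_between(t["created"], t["done"]) for t in tickets]
cycle_times = [days_between(t["started"], t["done"]) for t in tickets]
wait_times = [days_between(t["created"], t["started"]) for t in tickets]

print("median lead time (days): ", median(lead_times))
print("median cycle time (days):", median(cycle_times))
print("median wait time (days): ", median(wait_times))
```

In this sample, tickets spend a median of 7 days waiting and only 2 days being worked on. A gap of that kind suggests that the delay sits in a queue rather than in the work itself, which is where the search for the bottleneck should start.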
How You Recognize Impact
You recognize impact when the team can say where today's bottleneck actually is and performance budgets are invested specifically in removing it.
Sources
Eliyahu Goldratt — The Goal (North River Press, 1984)
Concept Visual
Constraints and Bottlenecks: Bottlenecks limit the flow in the entire system.