Migration of Software Development Bottlenecks: From Compute to Cognition to AI
Introduction: Bottlenecks Move
The fundamental insight of the Theory of Constraints is simple: at any given moment, a system's throughput is limited by a single binding constraint, and when you resolve it, the next one emerges. Bottlenecks don't disappear; they migrate.
The history of software development can be reread as a history of this bottleneck migration. And when we look at where that migration has brought us today, we find that humans and AI face remarkably similar kinds of constraints.
When Compute Was the Bottleneck
From the 1960s through the 1990s, the central constraint of software development was physical computer resources. Memory was expensive, CPU cycles were precious, and storage was limited.
The primary concern of developers in this era was "how to make things work with fewer resources." Choosing data structures and algorithms meant optimizing for computational complexity and memory usage. The difference between bubble sort and quicksort wasn't an academic curiosity—it was a matter of whether your program would finish in a practical timeframe.
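The gap between a quadratic and an O(n log n) sort is easy to demonstrate. A minimal sketch (pure Python; absolute timings will vary by machine, but the ratio is what matters):

```python
import random
import time

def bubble_sort(xs):
    """O(n^2): repeatedly swap adjacent out-of-order pairs."""
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

data = [random.randint(0, 10**6) for _ in range(2000)]

t0 = time.perf_counter()
bubble_sort(data)
t_bubble = time.perf_counter() - t0

t0 = time.perf_counter()
sorted(data)  # Timsort: O(n log n)
t_builtin = time.perf_counter() - t0

print(f"bubble: {t_bubble:.4f}s  built-in: {t_builtin:.4f}s")
```

Even at a modest 2,000 elements the quadratic version is orders of magnitude slower; on 1960s-era hardware that difference decided whether a program was usable at all.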
The practices born in this era were management strategies for compute as a finite resource:
- Algorithmic complexity analysis (Big-O notation) — How to allocate finite CPU time
- Data structure optimization — How to structure finite memory
- Caching strategies — How to reduce access to slow storage
- Database normalization and index design — How to optimize finite storage and I/O
What these practices share is a common thinking pattern: designing what to keep, what to discard, and how to structure things within finite resources. This pattern reappears in every subsequent era.
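That shared pattern is visible in something as small as an LRU cache: a fixed capacity forces an explicit policy for what to keep and what to evict. A minimal sketch using the standard library:

```python
from collections import OrderedDict

class LRUCache:
    """Keep at most `capacity` entries; evict the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key, default=None):
        if key not in self._store:
            return default
        self._store.move_to_end(key)  # mark as recently used
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the oldest entry

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used
cache.put("c", 3)  # capacity exceeded: evicts "b"
print(list(cache._store))  # peeking at internals for illustration: ['a', 'c']
```

The interesting part is not the code but the decision it encodes: *recency* is the retention criterion. Every finite-resource strategy in this article makes some version of that choice.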
When Cognition Became the Bottleneck
From the 2000s onward, the landscape shifted dramatically. Exponential growth in computing power under Moore's Law, the rise of cloud computing, and plummeting storage costs meant that, for most software, compute resources were no longer the primary constraint.
Yet productivity didn't scale proportionally with compute improvements. As Fred Brooks pointed out in "No Silver Bullet" (1986), the essential complexity of software development doesn't vanish with technological progress. With the compute bottleneck resolved, the next bottleneck emerged: human cognitive capacity.
Human working memory can hold only about four chunks (4 ± 1) at a time. No matter how fast computers get, it's still humans who design, understand, and modify systems. When codebases ballooned to millions of lines, the bottleneck shifted from "can the computer run this?" to "can a human understand this?"
The practices born in this era were management strategies for cognition as a finite resource:
- Modularization — Divide systems into cognitively manageable units. Limit the scope that must be understood at once
- Domain-Driven Design — Reduce translation overhead through Ubiquitous Language. Minimize the conceptual conversion cost between domain and code
- Team Topologies — Design team structure based on cognitive load. Leverage Conway's Law in reverse, splitting teams along cognitively manageable boundaries
- Separation of concerns, interface design — Explicitly define "what you don't need to know"
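The last point, explicitly defining "what you don't need to know," can be illustrated with a small interface sketch. The names here (`PaymentGateway`, `StripeGateway`, `checkout`) are hypothetical, invented for the example:

```python
from typing import Protocol

class PaymentGateway(Protocol):
    """The interface is everything a caller needs to understand."""
    def charge(self, amount_cents: int) -> bool: ...

class StripeGateway:
    """Hypothetical implementation; its internals stay out of the caller's head."""
    def charge(self, amount_cents: int) -> bool:
        # In reality: network calls, retries, idempotency keys, ...
        return amount_cents > 0

def checkout(gateway: PaymentGateway, amount_cents: int) -> str:
    # The caller reasons about one method signature,
    # not the whole payments domain.
    return "paid" if gateway.charge(amount_cents) else "declined"

print(checkout(StripeGateway(), 1999))
```

The interface caps the cognitive load of using the module at one method, regardless of how complex the implementation behind it becomes.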
What's notable here is that these practices share an isomorphic structure with compute-era practices:
| | Compute Resource Management | Cognitive Resource Management |
|---|---|---|
| Partitioning | Memory space partitioning, paging | Modularization, separation of concerns |
| Retention & Eviction | Cache strategy (what to keep in memory) | Abstraction (what to push out of awareness) |
| Structuring | Data structure design | Domain model design |
| Locality of access | Memory access locality | Cohesion and coupling |
Both are fundamentally tackling the same problem: what to keep, what to discard, and how to structure things within finite resources. The only change is that the subject shifted from CPU cycles and memory to working memory and attention.
The Isomorphic Constraint in AI
Now, AI has entered software development. AI is dramatically easing the "writing code" bottleneck. But AI, too, cannot escape finite resource constraints.
The context window of LLMs has a remarkably similar constraint structure to human working memory:
| | Working Memory | Context Window |
|---|---|---|
| Capacity | ~4±1 chunks | Thousands to millions of tokens |
| Behavior under pressure | Increased error rate, degraded judgment | Degraded response quality, increased hallucination |
| Forgetting | Information not attended to fades | Information beyond the window is lost |
| Chunking | Grouping information into meaningful units to save capacity | Summarization and compression to increase effective information density |
| Externalization | Recording in notes and documents to free the mind | Saving to file systems and databases |
The quantitative scale differs, but the structure of the constraint is isomorphic. Both face the same problem: "within finite processing capacity, which information to retain, which to externalize, and how to structure it all."
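The chunking and externalization rows of the table can be sketched as a toy context-fitting routine: keep the most recent messages verbatim and compress everything older into a single summary entry. Here `summarize` is a stand-in for what would be an LLM call in practice, and token counts are approximated by word counts:

```python
def fit_to_budget(messages, budget, summarize):
    """Keep recent messages verbatim; compress older ones into one chunk."""
    cost = lambda m: len(m.split())  # crude proxy for token count
    kept, used = [], 0
    for msg in reversed(messages):   # walk newest-first
        if used + cost(msg) > budget:
            break
        kept.append(msg)
        used += cost(msg)
    older = messages[:len(messages) - len(kept)]
    kept = list(reversed(kept))
    if older:
        # Chunking: many items collapse into one meaningful unit
        kept.insert(0, summarize(older))
    return kept

history = ["alpha beta", "gamma delta epsilon", "zeta", "eta theta"]
window = fit_to_budget(
    history, budget=4,
    summarize=lambda ms: f"[summary of {len(ms)} msgs]",
)
print(window)  # ['[summary of 2 msgs]', 'zeta', 'eta theta']
```

A person taking notes during a long meeting does the same thing: the last few sentences verbatim, everything earlier as a compressed gist.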
What this similarity suggests is that practices developed in the cognitive resource era can be directly applied to AI context management:
- Modularization → Context isolation: Give sub-agents independent context windows, returning only result summaries. Just as humans don't need to know a module's internal implementation, a parent agent doesn't need to know a sub-agent's details
- Ubiquitous Language → Shared vocabulary design: Reduce conceptual drift between humans and AI, or between AI agents, to minimize translation overhead
- Team Topologies → Multi-agent design: Partition agent responsibilities based on context load
- Progressive disclosure → Load only needed information: Instead of stuffing everything into the context, retrieve information on demand
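The first of these, context isolation, can be sketched in a few lines. The orchestrator and sub-agent below are hypothetical stand-ins, not any particular framework's API; the point is that each sub-agent works in a fresh context and only its summary flows back to the parent:

```python
def run_subagent(task, context):
    """Hypothetical sub-agent: does its work in an isolated context
    and returns only a short result summary, never its full trace."""
    trace = [f"step {i}: working on {task}" for i in range(3)]  # internal detail
    return f"{task}: done in {len(trace)} steps"                # summary only

def orchestrator(tasks):
    parent_context = []
    for task in tasks:
        summary = run_subagent(task, context=[])  # fresh, isolated context
        parent_context.append(summary)            # only summaries accumulate
    return parent_context

print(orchestrator(["analyze schema", "write migration"]))
```

The parent's context grows with the number of summaries, not with the total work performed, exactly as a team lead's mental model grows with the number of modules, not with their combined line counts.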
Finite Resource Management: The Invariant Problem Structure
Looking back at the discussion so far, we can see that the same problem structure has repeatedly emerged throughout the history of software development:
Finite resource → Development of management strategies → Resolution of bottleneck → Emergence of the next finite resource
Algorithms and data structures arose for compute constraints. Modularization and domain design arose for cognitive constraints. And isomorphic management strategies are being applied for context window constraints.
This means that software development has essentially been working on the same question throughout: "how to manage complexity within finite resources." The type of resource changes, but the structure of the question remains constant.
And this perspective has practical implications. When thinking about AI tool design and collaboration methods with AI, there's no need to invent a new paradigm from scratch. The decades of accumulated software engineering practices can be applied almost directly just by substituting the target resource. What we've learned about managing cognitive load also applies to managing context windows. This is, I think, good news for engineers.
The Question Beyond: What Happens When Context Constraints Disappear?
One caveat, however.
Just as compute constraints were (effectively) resolved, context window constraints may eventually be dramatically relaxed. When windows expand to effectively infinite size and AI can simultaneously process arbitrary amounts of information, what becomes the next bottleneck?
The experience of the compute era teaches us that resolving a constraint doesn't resolve the problem; it moves the problem. Abundant compute alone didn't make software better. The next bottleneck, human cognitive capacity, simply emerged.
Similarly, even with effectively infinite context windows, software won't improve on its own. Beyond that lie questions that no amount of quantitative capacity can answer: "what should we build?", "whose needs should we serve?", and "how should we design shared concepts?"
Related Articles
- The "Specification" Problem in the AI Era: Translation Asymmetry and the Location of SSoT — The translation problem between intent and implementation
- The True Nature of "Specifications": Understanding Software Through Requirements, Contract, and Structure — The multi-layered structure of specifications
