
Skepticism on Specification-Driven Development

Introduction: A Familiar Argument

I've started seeing the term "Specification-Driven Development" on social media and in tech articles. In the context of development with coding agents, people make claims like "finalize all specifications before development," "humans should only read and modify specs while coding agents read specs and modify code," and "humans shouldn't touch code."

The moment I saw these claims, I felt a strong sense of unease. Isn't this just Waterfall?

Looking back at the history of software development, we've transitioned from Waterfall to Agile and Extreme Programming (XP). The importance of feedback from implementation, the interaction between specification and implementation, continuous improvement—to ignore these insights and return to "write all specs first" seems like moving backwards through history.

What is Specification-Driven Development?

"Specification-Driven Development (SDD)" is a development methodology proposed with "Kiro," an integrated development environment released by Amazon Web Services (AWS) in July 2025. In September of the same year, GitHub announced "GitHub Spec Kit," which is based on similar principles, and in Japan, KDDI-affiliated companies began adopting it.

Kiro's workflow consists of three stages: Requirements → Design → Implementation. It positions specifications as the "Source of Truth" and adopts a "documentation-first" approach. Specifications serve as a common language between humans and AI, from which code is generated.

This methodology has attracted attention against the backdrop of concern about "vibe coding." Giving vague instructions to an AI and letting it generate code causes chaos in large codebases and raises concerns about quality and maintainability. Specification-driven development is often framed as one side of a binary opposition: "chaotic AI usage" versus "specification-driven development."

Historical Irony: Why Did Waterfall Revive in 2025?

Looking back at software development history, the Waterfall model was mainstream for a long time. The sequential process of Requirements → Design → Implementation → Testing was an approach borrowed from construction and manufacturing.

However, this model had a fatal problem. Too many issues are only discovered during the implementation phase. Specifications may be ambiguous, technically infeasible, or misaligned with user needs—these problems only become apparent when you actually write code.

The reason Agile and XP flourished was precisely this point. There's a complex feedback loop between specification and implementation, and improving while going back and forth between them was understood to be the key to making good software.

Yet in 2025, a Waterfall-like workflow of "Requirements → Design → Implementation" has emerged again. This can only be described as historical irony.

Why Do People Repeatedly Fall for the Illusion That "Writing Specs First Will Solve Everything"?

I believe human psychology is behind this phenomenon.

Fear of complexity—Software development is inherently complex. Confronting that complexity means enduring uncertainty and chaos. The approach of "writing all specs first" provides the illusion that this uncertainty can be eliminated.

Craving for predictability—Project managers and stakeholders want predictable processes. The formula "once specs are finalized, implementation is just mechanical" appears very attractive.

Swinging from extreme to extreme—As a reaction to the chaos of "vibe coding," an escape to the order of "specification-driven development" occurs. However, the appropriate way to deal with complexity often lies in the middle ground.

Martin Fowler's Skepticism: Criticism from Authority

Interestingly, Martin Fowler's blog features a critical evaluation of specification-driven development by Birgitta Böckeler.

She actually tried Kiro and spec-kit and pointed out the following problems:

Excessive Complexity

For a small bug fix, Kiro generated 4 user stories and 16 acceptance criteria. She described this as "using a sledgehammer to crack a nut."

In other words, it's excessive for small tasks and insufficient for large ones—the applicable range is extremely narrow.

The Problem of Non-determinism

Even more fatal is the non-determinism of LLMs (Large Language Models). No matter how detailed the specifications, there's a possibility that the AI will generate different code each time it runs.

"Because of the non-deterministic nature of this technology, there will always remain a very non-negligible probability that it does things that we don't want"

This observation is crucial. If, because of LLM non-determinism, reproducibility cannot be guaranteed no matter how perfect the specifications are, then a feedback loop of examining and correcting the implementation becomes indispensable.

Past Failures: Model-Driven Development (MDD)

The article also mentions Model-Driven Development (MDD), a past attempt. MDD was also an approach of "automatically generating code from models (specifications)," but it never became widely adopted.

History repeats itself.

Evolution of Abstraction: Lessons from Next.js

Abstraction cannot be perfected in one go. It improves gradually by receiving feedback from implementation.

There's a maxim from React developer Sebastian Markbåge:

"It's easier to recover from no abstraction than the wrong abstraction"

This insight is vividly apparent in the evolution of Next.js's caching strategy.

Next.js's Three-Stage Evolution

v13-14: Implicit Cache (abstraction developers don't think about)

Initially, Next.js embraced the ideal that "developers don't need to think about caching; it will be automatically optimized." But in reality:

  • Unexpected data freshness issues
  • Difficulty debugging
  • Increased cognitive load

v14-15: Gradual Improvements (documentation, default changes)

Receiving community feedback, gradual improvements were made:

  • Removal of default fetch() caching
  • Improved documentation transparency
  • Addition of staleTimes option

However, these were symptomatic fixes rather than a cure.
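For concreteness, the staleTimes option mentioned above is configured in next.config.js roughly like this (the numbers are illustrative, not recommendations):

```javascript
// next.config.js — sketch of the staleTimes option (Next.js 14.2+),
// which tunes how long the client-side router cache reuses segments.
// Values are in seconds and purely illustrative.
module.exports = {
  experimental: {
    staleTimes: {
      dynamic: 30,  // reuse window for dynamically rendered segments
      static: 180,  // reuse window for statically rendered segments
    },
  },
};
```

Note that this is still a tuning knob on top of an implicit cache; it adjusts behavior without making caching visible at the call site.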

v16: Explicit Abstraction with "use cache"

As a fundamental design change, the "use cache" directive was introduced:

```javascript
export async function getData() {
  "use cache";
  return fetch("...").then((res) => res.json());
}
```

The transition from implicit (opt-out) to explicit (opt-in) meant caching only works where developers intend it.

Lesson: Perfect Specifications Cannot Be Written in Advance

The Next.js team initially embraced the specification (design philosophy) that "developers don't need to be aware of caching." However, it was only after implementing it and having users actually use it that the problems with this specification became apparent.

Designs that work in the abstract reveal contradictions and ambiguities when you try to make them concrete—this is something many developers know from experience.

After years of receiving community feedback and discovering PPR (Partial Pre-Rendering), they finally arrived at a superior abstraction.

If the Next.js team had taken the approach of "completely finalizing specifications before implementation," this evolution would not have happened.

Digging Deeper into Non-determinism and Scale Issues

At this point, I also wondered, "Is it really impossible?" Could specification-driven development work with some ingenuity?

Full Generation vs Differential Migration

For example, instead of declaratively generating all code from specifications, what if we approached it as migration of existing code in response to specification differences? The AI performs migrations to reflect specification changes in the existing codebase.

This seems somewhat achievable.
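To make the idea concrete, here is a hypothetical sketch (all names and the spec shape are illustrative, not any tool's actual API) of turning a spec diff into scoped migration tasks rather than regenerating everything:

```javascript
// Hypothetical "differential migration": diff two versions of a spec
// and emit one migration task per change, instead of regenerating the
// whole codebase from the new spec.
function diffSpec(oldSpec, newSpec) {
  const tasks = [];
  for (const [key, value] of Object.entries(newSpec)) {
    if (!(key in oldSpec)) {
      tasks.push({ op: "add", key, value });
    } else if (oldSpec[key] !== value) {
      tasks.push({ op: "change", key, from: oldSpec[key], to: value });
    }
  }
  for (const key of Object.keys(oldSpec)) {
    if (!(key in newSpec)) tasks.push({ op: "remove", key });
  }
  return tasks;
}

const v1 = { login: "email+password", session: "localStorage token" };
const v2 = { login: "email+password", session: "httpOnly cookie", mfa: "TOTP" };
console.log(diffSpec(v1, v2)); // one "change" task, one "add" task
```

Each task would become a scoped prompt for the agent; but note that nothing in the diff says whether a change conflicts with the existing implementation structure, which is exactly where human verification re-enters.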

However, problems quickly become apparent. What do you do when specification changes conflict with existing implementation structures? There's a risk of fatal structural contradictions, and ultimately human verification becomes essential.

What About Microservice Level?

What if we limit the scale? For example, at the granularity of a single service in an Event-Driven architecture—a sufficiently small, highly independent component with clear boundaries.

At this size, might specification-driven work?

But this is precisely Fowler's "scale problem": excessive for small tasks, insufficient for large ones. The applicable range is extremely narrow.

The psychology of wanting to think "but what if..."—this itself might be the desire for a silver bullet. The desire to escape complexity repeatedly leads us to the same illusion.

Essential Question: What Has Changed and What Hasn't?

The advent of coding agents has certainly changed something. But not everything has changed.

To answer this question, let's consider the developer's cognitive structure. Developer cognition consists of two models: Concept Model (how things should be) and Production Model (understanding of current state). The "specification" in specification-driven development refers precisely to this Concept Model.

The problem with specification-driven development is trying to perfectly construct the Concept Model first and then reflect it all at once onto the Production Model. This means no feedback is obtained, the gap between both widens, and uncertainty is maximized—the very reason Waterfall failed.

What Has Changed

The cost of converting specification → implementation has dramatically decreased.

Implementation that previously took hours now completes in minutes. In other words, the cycle of reflecting the Concept Model onto the Production Model has become blazingly fast.

Prototyping has accelerated and the trial-and-error cycle has shortened.

What Hasn't Changed

However, the following hasn't changed:

  • The difficulty of writing good specifications (Concept Model) — Building an unambiguous, implementable model remains difficult
  • The value of feedback from implementation — Many problems only become apparent when you see actual working code (Production Model)
  • The difficulty of building domain models — Appropriately modeling complex domains is a problem that precedes writing code
  • LLM non-determinism as a new constraint — Rather, new uncertainty has been added
  • The structural problem that gaps will always exist between Concept Model and Production Model

Essence: Rapid Iteration Between Concept and Production

I believe the essence of coding agents is not "finalizing the Concept Model first" but "being able to rapidly iterate between Concept Model and Production Model."

Previously, it took time to update implementation (Production Model) after changing specifications (Concept Model). That's why "finalizing specifications first" was emphasized.

But now, we can iterate between both as many times as we want at high speed. That's why an approach of co-evolving both while running the feedback loop has become possible.

Reducing the gap to zero is fundamentally impossible (new feedback constantly comes from outside, and the Concept Model moves ahead). What's important is keeping the gap small and manageable.

DDD and Context Constraints

From a Domain-Driven Design (DDD) perspective, there are also questions about specification-driven development. The Ubiquitous Language that Eric Evans emphasized is shared between domain experts, developers, and code. This means specifications and code are not separated.

When both the codebase and the documentation are maintenance targets, a source of cognitive load emerges: "which one is correct?" From this perspective, the Source of Truth should be limited to the codebase. Design proposals and ADRs can be kept as records of past deliberation, but making clear that they are not ongoing maintenance targets prevents the burden of keeping two artifacts in sync from growing.

The same applies to dialogue with coding agents. LLMs have context size constraints, and having them read both specifications and code wastes context. If the codebase expresses sufficient structure and intent, the AI can understand the Production Model by reading the codebase.

The Third Way: Spec-Implementation Co-evolution Approach

There is a third way that is neither "vibe coding" nor "specification-driven development." The approach of rapidly iterating between specification and implementation.

Consider adding an authentication feature. In a Waterfall approach, you would design "OAuth 2.0, JWT, refresh tokens, session management..." all upfront. With spec-implementation co-evolution, you instead start with a minimal Concept Model: "Users can log in with email and password." Have the coding agent implement it, look at the generated code, and notice: "Tokens stored in localStorage? Shouldn't they be in httpOnly cookies for XSS protection?" Update the Concept Model and reimplement. By cycling through this rapidly, you refine gradually while keeping the gaps small.
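The localStorage-to-httpOnly revision in that loop can be sketched as follows; the helper name and cookie attributes are illustrative, not a prescribed implementation:

```javascript
// After the feedback step, the revised Concept Model says the session
// token must be delivered as an httpOnly cookie, keeping it out of
// reach of document.cookie (and thus of XSS payloads).
function buildSessionCookie(token) {
  return [
    `session=${token}`,
    "HttpOnly",          // invisible to client-side JavaScript
    "Secure",            // only sent over HTTPS
    "SameSite=Strict",   // not sent on cross-site requests
    "Path=/",
    `Max-Age=${60 * 60}`, // one hour, illustrative
  ].join("; ");
}

// A login handler would then respond with
//   res.setHeader("Set-Cookie", buildSessionCookie(token));
// instead of returning the token in the JSON body for localStorage.
console.log(buildSessionCookie("abc123"));
```

The point is not this particular fix but that the need for it only surfaced from looking at working code, the Production Model, not from the spec.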

Without feedback from the Production Model, you cannot write a good Concept Model.

Conclusion: Feedback Loops Are the Essence

What proponents of specification-driven development overlook is that complexity exists precisely in the interaction between specification and implementation. Complexity doesn't disappear by detailing specifications.

There is no silver bullet. This truth, which Fred Brooks pointed out in 1986, remains unchanged in 2025. Having acquired the powerful tool of coding agents, we are tempted by the illusion that "writing specs will automatically produce the code." But history teaches that swinging from one extreme to the other fails.

The evolution of Next.js's caching strategy, the transition from Waterfall to Agile, and this specification-driven development debate—what all these share is the importance of feedback loops. Receive feedback from implementation to improve specifications, reflect specification intent into implementation, and receive feedback again. This circulation is the key to creating good software.

