Deep Dive into the Three AI Programming Frameworks: The Right Way to Do Specification-Driven Development

Why Does AI Always Go Off the Rails When Writing Code?

What's truly separated the best from the rest in AI programming over the past two years isn't who writes fancier prompts — it's whether you do one critical thing right before typing the first line of code: establish specifications.

Many developers' daily routine looks like this: tell the AI "build me a login feature" and expect it to read your mind. But you never specified the tech stack, which module it should integrate with, or which existing code is off-limits. The AI can only guess — if it guesses right, you're golden; if it guesses wrong, you spend even more time reworking.

The root problem boils down to two things:

You didn't communicate requirements clearly — The AI lacks project context and can only "fill in the blanks" based on training data
AI has no persistent memory — By the tenth message, it's already forgotten the rules established in the first, and starts doing its own thing

Why does AI inherently lack project context? This goes back to how large language models work. LLMs are built on the Transformer architecture, processing input token sequences through attention mechanisms. Their "understanding" depends entirely on information within the current context window. Even models with 128K or larger context windows cannot automatically perceive your project's directory structure, existing database schema, team coding conventions, or historical decisions. Their training data covers massive amounts of open-source code, but these represent statistical patterns rather than understanding of your specific project. That's why the same prompt "build me a login feature" might yield a JWT solution, a Session solution, or an OAuth solution — the AI selects the most probable output in probability space, not the most suitable one for your project.

The "no persistent memory" problem stems from the fact that current mainstream LLM conversation mechanisms are fundamentally stateless. With each API call, the model needs to reprocess the complete conversation history as input. As conversation turns increase, earlier information gets progressively diluted in attention weight distribution — this is the so-called "Lost in the Middle" phenomenon. Research shows that once context exceeds a certain length, the model's recall rate for information in middle positions drops significantly. While technologies like RAG (Retrieval-Augmented Generation) and vector databases can partially mitigate this, they introduce additional engineering complexity, and retrieval quality directly impacts generation quality.

These two problems together point to one core conclusion: You can no longer just chat with AI about requirements — you need to write the rules as documents and let the documents keep the AI in check. Writing specifications as structured documents essentially replaces uncertain memory recall with deterministic text injection.

What Is Specification-Driven Development (SDD)

This approach has a formal name — Specification-Driven Development (SDD). In one sentence: establish the rules first, then write the code.

In traditional development, we also talk about requirements documents and technical proposals. But in the era of AI-collaborative programming, these documents have shifted from being "read by humans" to being "executed by AI." Documentation is no longer a formality — it's the AI's behavioral guidelines and execution boundaries.

SDD didn't appear out of thin air; it inherits decades of software engineering methodology. From Royce's waterfall model in the 1970s emphasizing requirements specification documents, to Design by Contract in the 1990s requiring preconditions and postconditions to constrain function behavior, to TDD (Test-Driven Development) using test cases as executable specifications — the philosophy of "define behavioral boundaries first, then implement functionality" runs through all of them. What makes SDD unique is that it extends the consumers of these specifications from human developers to AI agents. The format, granularity, and expression of documents need to be optimized for LLM comprehension characteristics — for example, using structured YAML/Markdown templates, explicit positive/negative constraint lists, and machine-parseable acceptance criteria.

Currently, there are three mainstream frameworks for implementing SDD: OpenSpec, Spec Kit, and Super Powers. Many people get confused when they find these three names, thinking they're competing for the same territory. In reality, each handles a different phase and they can be used together.

What Each Framework Handles

Blueprint Phase Framework: Defining Boundaries and Constraints

The first framework's core responsibility is: forcing you to think through the blueprint before AI starts writing code.

It requires you to clearly answer:

What will and won't be done this time
Where the technical boundaries are
Which constraints must never be violated

Only after the blueprint passes review can the AI begin work. This is like an architect completing construction drawings before workers can enter the site. Building without drawings almost guarantees rework.

From a technical implementation perspective, the blueprint phase framework typically contains several key components: constraint declaration files (defining hard constraints like tech stack, dependency versions, architectural patterns), scope boundary files (using a three-way split of "must do / must not do / can do later" to clarify requirement boundaries), and validation rules (automatically checking whether the AI's plan violates constraints before it generates code). This is similar to TypeScript's type system — you pre-declare the "shape," and any output that doesn't match gets intercepted. In practice, these constraint files are typically stored in .md or .yaml format in a specific folder at the project root, and AI programming tools (like Cursor, Windsurf, etc.) automatically read these files as part of the system prompt.

Execution Flow Framework: Step-by-Step Execution with Checkpoint Confirmation

The second framework manages the entire process from initiation, design, implementation to delivery.

Its core mechanism: every step has a checkpoint, and every step forces a pause for your confirmation. If you don't give the green light, the AI doesn't proceed.

This solves the problem of AI "writing a massive chunk of code in one go." With process controls, the AI takes two steps then pauses. Mistakes get caught on the spot, rather than having to sift through 500 lines of generated code after the fact.

This "checkpoint" mechanism essentially implements the Human-in-the-Loop model in human-AI collaboration. This concept originates from automation control theory — retaining human approval authority at critical decision nodes, leveraging AI's efficient execution capability while preventing cascading error amplification. In AI programming scenarios specifically, checkpoints are typically set at: after architecture design completion, after core interface definition, after each independent module implementation, and before integration testing. Each checkpoint contains three elements: a description of what the AI needs to deliver, a checklist of standards for human verification, and branching workflows for pass/fail outcomes. This mechanism downgrades the AI from an "autonomous agent" to a "supervised executor," dramatically reducing the risk of losing control.

Change Record Framework: Incremental Tracking and Issue Tracing

The third framework manages traceability of incremental changes.

Add a feature today, modify a requirement tomorrow — it records every change like Git commits. When something breaks later, you can trace back and find exactly which change planted the landmine.

This is extremely important in real projects — requirements always change, and the key is whether changes can be traced afterward.

It's worth noting that while the change record framework is often compared to Git, the dimensions it tracks are fundamentally different. Git records text differences (diffs) in code files, while the change record framework tracks semantic changes at the requirements level — why something changed, what business logic was modified, and which modules' behavioral contracts were affected. This is closer to "Audit Logs" or Event Sourcing patterns in the database domain. When AI modifies code based on new requirements, it must not only commit code changes but also synchronously update relevant sections in specification documents and record the motivation, impact scope, and rollback plan in the change log. This dual-track recording means issue tracing no longer requires reading code diffs line by line — you can quickly locate problems from the business semantics level.

How the Three Frameworks Work Together

These three frameworks aren't pick-one-of-three — they each handle a different phase and are meant to be used together:

Phase	Responsibility	Analogy
Blueprint Phase	Define boundaries and constraints	Architect
Execution Phase	Step-by-step execution + checkpoint confirmation	Construction process
Change Phase	Record every modification	Change ledger

For new projects starting from scratch, all three can be used in sequence; for adding features or refactoring existing projects, you might focus more on the latter two. The key is choosing the appropriate framework based on your project's current state.

Practical Implementation Advice

Framework Selection for Different Scenarios

New projects from scratch: Start with the blueprint framework to clarify boundaries, then use the flow framework for step-by-step progress
Adding features to existing projects: Focus on the change record framework to ensure additions don't break existing logic
Refactoring legacy projects: All three are needed, but pay extra attention to the "what not to do" list in the blueprint phase

Core Principles of Specification-Driven Development

Honestly, you don't necessarily need to use all three frameworks heavily. What's more important is understanding what each one handles and where they don't conflict. Even if you prefer maintaining documentation specifications manually, understanding their division of labor gives you the confidence to choose tools later.

Rather than rushing to adopt any single framework, building the "specifications first" mindset is what truly matters. Regardless of which tool you ultimately use, the core logic is the same: make AI work within clear boundaries, pause at critical nodes for your confirmation, and leave a traceable record of every change.

Summary

Looking back at this article, three things should now be clear:

The root cause of AI programming going off the rails is lack of specification constraints, not insufficient prompt engineering skills
The three frameworks handle blueprints, processes, and changes respectively — each with its own role, no conflicts
Choosing frameworks based on your project's current phase is more pragmatic than blindly adopting the full suite

The essence of specification-driven development is externalizing your understanding of the project into documents that AI can execute. This doesn't add workload — it's about thinking through things that should have been thought through anyway, just earlier. The difference is that before, you could get away with being vague; now, AI will faithfully amplify your vagueness a hundredfold.

Why Does AI Always Go Off the Rails When Writing Code?

The root problem boils down to two things:

You didn't communicate requirements clearly — The AI lacks project context and can only "fill in the blanks" based on training data
AI has no persistent memory — By the tenth message, it's already forgotten the rules established in the first, and starts doing its own thing

What Is Specification-Driven Development (SDD)

This approach has a formal name — Specification-Driven Development (SDD). In one sentence: establish the rules first, then write the code.

What Each Framework Handles

Blueprint Phase Framework: Defining Boundaries and Constraints

The first framework's core responsibility is: forcing you to think through the blueprint before AI starts writing code.

It requires you to clearly answer:

What will and won't be done this time
Where the technical boundaries are
Which constraints must never be violated

Execution Flow Framework: Step-by-Step Execution with Checkpoint Confirmation

The second framework manages the entire process from initiation, design, implementation to delivery.

Its core mechanism: every step has a checkpoint, and every step forces a pause for your confirmation. If you don't give the green light, the AI doesn't proceed.

Change Record Framework: Incremental Tracking and Issue Tracing

The third framework manages traceability of incremental changes.

Add a feature today, modify a requirement tomorrow — it records every change like Git commits. When something breaks later, you can trace back and find exactly which change planted the landmine.

This is extremely important in real projects — requirements always change, and the key is whether changes can be traced afterward.

How the Three Frameworks Work Together

These three frameworks aren't pick-one-of-three — they each handle a different phase and are meant to be used together:

Phase	Responsibility	Analogy
Blueprint Phase	Define boundaries and constraints	Architect
Execution Phase	Step-by-step execution + checkpoint confirmation	Construction process
Change Phase	Record every modification	Change ledger

Practical Implementation Advice

Framework Selection for Different Scenarios

New projects from scratch: Start with the blueprint framework to clarify boundaries, then use the flow framework for step-by-step progress
Adding features to existing projects: Focus on the change record framework to ensure additions don't break existing logic
Refactoring legacy projects: All three are needed, but pay extra attention to the "what not to do" list in the blueprint phase

Core Principles of Specification-Driven Development

Summary

Looking back at this article, three things should now be clear:

The root cause of AI programming going off the rails is lack of specification constraints, not insufficient prompt engineering skills
The three frameworks handle blueprints, processes, and changes respectively — each with its own role, no conflicts
Choosing frameworks based on your project's current phase is more pragmatic than blindly adopting the full suite

Deep Dive into the Three AI Programming Frameworks: The Right Way to Do Specification-Driven Development

Why Does AI Always Go Off the Rails When Writing Code?

What Is Specification-Driven Development (SDD)

What Each Framework Handles

Blueprint Phase Framework: Defining Boundaries and Constraints

Execution Flow Framework: Step-by-Step Execution with Checkpoint Confirmation

Change Record Framework: Incremental Tracking and Issue Tracing

How the Three Frameworks Work Together

Practical Implementation Advice

Framework Selection for Different Scenarios

Core Principles of Specification-Driven Development

Summary

Related articles

Claude Code for Test Development in Practice: An AI Programming Workflow That Doubles Your Efficiency

Hermes Agent Hands-On Review: An AI Efficiency Revolution for Indie Game Developers

Vibe Coding Beginner's Guide: Tool Selection Across Three Categories with Practical Examples

Deep Dive into the Three AI Programming Frameworks: The Right Way to Do Specification-Driven Development

Why Does AI Always Go Off the Rails When Writing Code?

What Is Specification-Driven Development (SDD)

What Each Framework Handles

Blueprint Phase Framework: Defining Boundaries and Constraints

Execution Flow Framework: Step-by-Step Execution with Checkpoint Confirmation

Change Record Framework: Incremental Tracking and Issue Tracing

How the Three Frameworks Work Together

Practical Implementation Advice

Framework Selection for Different Scenarios

Core Principles of Specification-Driven Development

Summary

Related articles

Claude Code for Test Development in Practice: An AI Programming Workflow That Doubles Your Efficiency

Hermes Agent Hands-On Review: An AI Efficiency Revolution for Indie Game Developers

Vibe Coding Beginner's Guide: Tool Selection Across Three Categories with Practical Examples

Related articles

2026年6月8日·1 min
Claude Code for Test Development in Practice: An AI Programming Workflow That Doubles Your Efficiency
A practical guide to Claude Code for test development: auto-generating test scripts, Plan Mode workflows, MCP + Playwright integration, and Subagent parallel tasks to build systematic AI-assisted workflows.
Read more →

2026年6月8日·3 min
Hermes Agent Hands-On Review: An AI Efficiency Revolution for Indie Game Developers
Indie game developer reviews Hermes Agent vs OpenClaude: intelligent context compression, real-time Memory, remote control via Telegram, and practical use cases in game dev, social media, and email.
Read more →

2026年6月8日·4 min
Vibe Coding Beginner's Guide: Tool Selection Across Three Categories with Practical Examples
A comprehensive guide to Vibe Coding's three tool categories: Agent frameworks, CLI Coding, and IDE tools, with practical examples including Snake game and data analysis workbench.
Read more →