Claude Code Workflow in Practice: From Requirement Grilling to AFK Agent Auto-Coding

Introduction: From Philosophy to Practice

Much of the content about AI programming stays at the conceptual level — "treat AI as a junior developer," "focus on architecture, not implementation." But the real question is: how exactly do you do it? A senior developer recently shared his complete workflow for building real features with Claude Code — from vague requirements to working code, with every step well-documented.

The core project in this case is a course video manager with roughly 1,200 commits and 637 closed Issues. The tech stack is React Router + TypeScript + Node + Drizzle ORM + PostgreSQL, backed by extensive Vitest tests. This isn't a demo project — it's a tool running in a real production environment.

The choice of tech stack itself is worth noting. Drizzle ORM is a type-safe ORM framework that has risen rapidly in the TypeScript ecosystem in recent years. Unlike competitors such as Prisma, it adopts an API design that closely mirrors native SQL syntax, allowing developers to write queries in TypeScript that are nearly identical to raw SQL while enjoying full type inference. This design philosophy makes it particularly popular in scenarios requiring fine-grained control over database operations. React Router, after merging with the Remix framework in v7, became a full-stack framework supporting server-side rendering and data loading — which explains why this project can handle both frontend and backend logic with it. Combined with Vitest, a testing framework known for its blazing-fast startup speed, the entire stack provides the rapid-feedback infrastructure essential for AI-assisted development.

Phase 1: Grill Me — Refining Requirements Through Interrogative Dialogue

From Vague Ideas to Precise Definitions

The feature to be built revolves around the concept of "Ghost Courses." In the existing system, courses have two states — "ghost" and "real." Ghost courses exist only in the database, while real courses are also mapped to the file system. The pain points in actual use included:

Creating a real lesson requires first creating a ghost lesson, then manually materializing it — a tedious process
Deleting a real lesson can't be done directly; it must first be converted to ghost state
There's no way to create a "ghost course" — a planned course not yet mapped to the file system

These requirements were initially very rough. The developer dictated his ideas directly to Claude Code, then entered "Grill Me" mode — letting the AI ask questions back to gradually clarify the requirement boundaries.

Key Questions Raised by AI

After exploring the codebase, Claude Code posed a series of precise questions:

"Looking at the code, the course write service's lesson deletion already handles both ghost and real courses. Is there something in the UI that forces the convert-to-ghost-then-delete workflow?"

UI Gap Analysis

This question pinpointed the real issue — it wasn't a missing backend logic problem, but rather the UI not exposing a direct deletion entry point. The AI then asked an even deeper question:

"Ghost courses have no file path, and their sections and lessons would all be ghost by necessity. When you say a ghost course can have real lessons, what does 'real' mean without a file system?"

This forced the developer to clarify a critical design decision: when creating a real lesson within a ghost course, it should trigger a Materialization Cascade — a modal pops up requesting a file path, then the course, sections, and lessons are all materialized at once.

The Materialization Cascade concept hides a classic distributed consistency challenge. When an operation needs to simultaneously modify database records and the file system, it faces a transaction boundary problem. Databases support transactional rollback, but file system operations (creating directories, writing files) typically don't support atomic rollback. The "materialization failure in non-Git repos causing desynchronization" issue discovered during the later QA phase is a textbook manifestation of this challenge. In Git-enabled scenarios, Git's version control capabilities can serve as a compensation mechanism: execute file operations first, and if the database transaction fails, roll back file changes via git reset. This Compensating Transaction pattern is a common strategy for handling cross-system consistency, also known as a simplified version of the Saga pattern.

Ubiquitous Language: The Foundation of Human-AI Communication

Throughout the Grill Me conversation, the developer placed special emphasis on the Ubiquitous Language concept from Domain-Driven Design (DDD). He maintained a "Ubiquitous Language" document defining terms like "ghost entity," "materialization," and "materialization cascade." The benefits are obvious:

AI can precisely understand terminology when searching the codebase
Subsequent conversations can use concise statements like "there's a bug in the materialization cascade"
Function naming becomes natural and consistent

Ubiquitous Language is one of the core concepts proposed by Eric Evans in his book Domain-Driven Design. The central idea is that the development team and business experts should establish a shared, rigorously defined terminology system — one that's used not only in daily communication but also directly reflected in code naming. Class names, method names, and variable names should all align with the ubiquitous language. In traditional teams, ubiquitous language solves the problem of communication ambiguity between people; in the context of AI-assisted development, its value is amplified further. Large language models rely on context to understand semantics. When naming in the codebase is highly consistent with terminology in documentation, the model can more accurately locate relevant code, understand business intent, and generate new code that follows project conventions. This essentially extends DDD's collaborative philosophy from human-human to human-AI.

This 22-minute grilling session essentially accomplished all the work of a traditional requirements review meeting.

Phase 2: Automated Breakdown from PRD to Issues

Auto-Generating the Product Requirements Document

After the grilling, the developer had Claude Code organize the conversation into a PRD (Product Requirements Document). Since questions and answers were closely adjacent throughout the Q&A process, the attention mechanism could effectively capture contextual associations, resulting in a high-quality PRD.

The attention mechanism mentioned here is the core principle of the Transformer architecture. It allows the model, when processing a given position in a sequence, to dynamically "attend to" information at all other positions and assign different weights based on relevance. In the Grill Me conversation, questions and answers are physically adjacent in the text sequence, meaning that when the model generates the PRD, each requirement point can efficiently "look back" at the corresponding Q&A context without losing associations due to information being scattered across different positions in a long text. This also explains why completing requirements discussions in a single continuous conversation is more effective than spreading them across multiple sessions — the former maintains information locality within the context window.

Module Interface Design

A particularly interesting part of the PRD was the modular design section. The AI clearly outlined the changes needed for each module:

Course write service: add a materializeCascade method
Database schema: make the course filePath field nullable
Two new API endpoints: direct real lesson creation, direct real lesson deletion
UI layer: materialization modal, two types of create buttons, delete operations

During review, the developer focused on interface design rather than implementation details — ensuring testability and ensuring future AI agents could understand module responsibilities.

Breaking Down into Executable GitHub Issues

The PRD was automatically broken down into 4 independent Issues, each containing:

User stories
Acceptance criteria
Blocking relationships
Testing decisions

The inclusion of testing decisions was particularly crucial — it guides subsequent AI agents toward a TDD approach, establishing a test feedback loop before coding. Test-Driven Development (TDD) requires writing test cases before writing feature code, following the "red-green-refactor" cycle: first write a failing test (red), then write the minimum code to make the test pass (green), and finally refactor to keep the code clean. This methodology has a natural synergy with AI agent coding — for AI, test cases provide a clear, machine-verifiable success criterion. The agent doesn't need a human to judge code correctness in real time; it simply runs the test suite for instant feedback. Vitest is known for its extremely fast startup speed and hot module replacement capabilities, enabling AI agents to get test results within seconds after each code change, dramatically accelerating this automated iteration loop.

Phase 3: AFK Agent — Step Away and Let AI Do the Work

Sandcastle Architecture Explained

The developer built an AFK (Away From Keyboard) agent runtime environment called "Sandcastle." The core architecture works as follows:

A Docker container mounts the current working directory
Claude Code runs inside the container, reads GitHub Issues and implements them one by one
Generated commits are extracted as patches and applied to the local repository
The loop executes with a maximum iteration count of 100

Running the AI agent in a Docker container is a design choice with multiple considerations. Docker containers provide process-level isolation, ensuring the AI agent's file system operations don't accidentally damage the host environment. Through volume mounting, code changes inside the container sync to the host in real time, while the patch extraction mechanism provides an additional safety layer — rather than letting AI modify the local repository directly, it first generates patch files, which an automation script selectively applies. This "sandbox" pattern is critical in AI autonomous coding scenarios because large language models may execute unexpected commands (such as accidentally deleting files or modifying system configurations), and containerization limits potential damage to a controllable scope. The 100 maximum iteration cap serves as a cost control mechanism, preventing the agent from falling into infinite loops that consume API call quotas.

The launch command is as simple as pnpm wealth, and then the developer can step away to make tea, take a walk, or even chat with family.

Local Build and Testing

About an hour and a half later, the agent had completed 5 iterations, generating 6 commits, each with detailed commit messages.

Day Shift and Night Shift: A Metaphor for Human-AI Division of Labor

This working pattern is vividly described as "day shift" and "night shift":

Day shift (human): Think, grill, organize PRD, break down Issues, perform QA acceptance
Night shift (AI agent): Read Issues, write code, run tests, commit code

The core insight is: once thinking and requirement definition are complete, the human's work is essentially done — until quality assurance on the output is needed.

Phase 4: QA Loop — Discovering Edge Cases AI Can't Foresee

Feedback-Driven Iterative Testing

The developer had Claude Code generate a QA plan based on recent commits, then manually tested each item. Multiple issues were quickly discovered:

The add-course modal had an unnecessary "ghost/real" tab toggle
After creating a ghost course, there was no automatic navigation to the new page, and the modal didn't close
Missing loading states, causing a confusing user experience

More seriously, there was an edge case no one had anticipated: if the course repository isn't a Git repo, a failure during materialization would leave the file system and database in a desynchronized state. This is precisely the distributed consistency problem mentioned earlier surfacing in a real scenario — without Git as a compensating transaction rollback mechanism, file system changes can't be atomically undone, and if the cascade operation fails midway, the system ends up in a "half-materialized" inconsistent state.

"This kind of thing makes me start thinking — pure upfront planning just doesn't work. When you're in the QA loop iterating constantly, you encounter these weird edge cases that are really hard to plan for in advance."

Parallel Issue Creation and Automated Fixes

For each problem discovered, the developer automatically created a GitHub Issue via a feedback button, while the AFK agent continuously fixed them in the background. Within 8 minutes, 6-7 Issues were created, with the agent processing them in parallel. After 8 iterations and 14 commits, the feature was essentially complete.

Core Methodology Summary

This case demonstrates not some magical prompt trick, but a complete human-AI collaborative engineering methodology:

Grill Me Requirement Interrogation: Let AI ask questions back to transform vague ideas into precise requirements (~22 minutes)
Ubiquitous Language Maintenance: Establish domain terminology documentation to ensure zero ambiguity in human-AI communication
Automated PRD Generation: Leverage conversation context to automatically output product requirements documents
Automated Issue Breakdown: Decompose the PRD into independently executable tasks with testing decisions included
AFK Agent Execution: Claude Code autonomously completes coding and testing inside Docker
QA Iteration Loop: Manual testing → Create Issues → Agent fixes → Retest

Throughout the entire process, the developer barely looked at a single line of code. He focused on inputs (requirement definition) and outputs (feature behavior), not the implementation process. This is perhaps a microcosm of the role transformation for software engineers in the AI era: from people who write code to people who define problems and verify results.