Claude Code Workflow in Action: 68 Sub-Agents Working Concurrently

Claude Code's Hidden "Super Coding" Mode

Claude Code (Anthropic's command-line programming tool) has a powerful but little-known feature: Workflow mode. In this mode the AI automatically decomposes a complex task and dispatches dozens of sub-agents to execute the pieces concurrently, an "AI assembly line" of sorts. Recently, a Chinese tech content creator tried the feature on a real project, and a single task ended up dispatching 68 sub-agents.

Unlike the common IDE plugin-style AI assistants (such as GitHub Copilot and Cursor, which primarily offer code completion and chat within an editor), Claude Code is a command-line native tool that runs in the terminal. It can directly access the file system, execute shell commands, and operate on Git repositories, giving it deep awareness of the entire project. This design makes it naturally suited to engineering-level task orchestration rather than just single-file code assistance. Workflow mode is an advanced feature built on top of this architectural advantage.

How to Enable Workflow Mode

Enabling this feature isn't complicated, but it does require manual configuration:

Enter the /effort command in Claude Code
The default thinking mode is usually hi (high thinking mode)
Switch it to the highest level: also coder (Super Coding Mode)

[Image: Enabling Claude Code Super Coding Mode]

Once enabled, Claude Code will automatically decide whether to dispatch sub-agents based on the complexity of your submitted task. The "sub-agent" here is a core concept in Multi-Agent Systems — each sub-agent is an independent AI reasoning instance with its own context window and execution environment, capable of autonomously completing assigned subtasks and returning results to the main agent. Unlike simple function calls, sub-agents possess a degree of autonomous decision-making ability, allowing them to plan execution steps, invoke tools, and even perform multi-round reasoning as needed. This architecture enables complex tasks to be truly "divided and conquered."

At its core, this is a self-organizing programming workflow — the AI constructs a complete task orchestration plan on its own (similar to Harness), without requiring users to manually break down tasks. Harness, mentioned here, is a well-known CI/CD orchestration platform in the DevOps space. Its core capability is decomposing software delivery processes into multiple stages and steps, defining dependencies and parallelization strategies between them, and then automating the entire pipeline. Claude Code's Workflow mode borrows a similar philosophy: the AI plays the role of an "orchestration engine," automatically analyzing task dependencies, determining which subtasks can run in parallel and which need to wait in sequence, ultimately forming a dynamic task DAG (Directed Acyclic Graph). The difference is that traditional CI/CD tools require manually writing pipeline configuration files, whereas Claude Code's workflow is entirely generated autonomously by the AI.
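The idea of a task DAG with parallel and sequential parts can be illustrated with a small sketch. This is not how Claude Code is implemented internally; the subtask names below are hypothetical, and the grouping uses Python's standard `graphlib` module purely to show which subtasks could run side by side and which must wait.

```python
from graphlib import TopologicalSorter

# Hypothetical subtasks: each key maps to the set of subtasks it depends on.
dependencies = {
    "write_prompt_a": set(),
    "write_prompt_b": set(),
    "write_prompt_c": set(),
    "review_all": {"write_prompt_a", "write_prompt_b", "write_prompt_c"},
    "final_report": {"review_all"},
}


def parallel_waves(deps):
    """Group tasks into waves; every task in a wave has all dependencies met."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished, unlocking dependents
    return waves


waves = parallel_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {wave}")
```

In this toy graph the three writing subtasks land in the same wave and could be handed to concurrent sub-agents, while the review and the final report each wait for the wave before them.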

Real-World Scenario: Batch Testing a Digital Avatar's Writing Skills

Task Background

The creator was developing a digital avatar project that includes a Writing Skill module. This module contains a large number of prompts for different scenarios, each of which needed to be tested individually for writing quality.

[Image: Writing Skill Module Interface]

Task Division: Write-Review Separation

The task design was quite clever, employing a "write-review separation" principle:

Main agent: Responsible for overall supervision and task scheduling, as well as final acceptance review
Sub-agents: Dispatched to execute specific writing tasks, generating content using different prompts
Review phase: Unified review by the main agent, ensuring "those who write don't review, and those who review don't write"

[Image: Write-Review Separation Workflow]

This design draws from the Code Review philosophy in software engineering — producers and reviewers must be separated to ensure quality. In software engineering practice, this principle has deep theoretical foundations: psychological research shows that content creators have a natural "Confirmation Bias" toward their own output, tending to overlook flaws in their own work. From early "Peer Review" practices to the Pull Request review mechanisms widely adopted in modern open-source communities, "separation of production and review responsibilities" has become one of the cornerstone principles of software quality assurance. Google's internal engineering practices even require every line of code to be reviewed by at least one non-author engineer before it can be merged into the main branch. Claude Code transfers this human engineering wisdom to AI multi-agent collaboration, having different AI instances take on the roles of "producer" and "reviewer" respectively, thereby building a self-consistent quality control loop within the AI system.

Claude Code's Workflow mode natively supports this kind of multi-role collaboration.
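The write-review separation described in this section can be sketched in a few lines of Python. Everything here is an assumption for illustration: `write_draft` and `review` are hypothetical stand-ins for model calls, the prompt names are invented, and the quality check is a placeholder. The only point is the structure: writers run concurrently, and the reviewer is never the author.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Draft:
    prompt_name: str
    author: str
    text: str


async def write_draft(agent_id: str, prompt_name: str) -> Draft:
    """Stand-in for a sub-agent; a real one would call a model here."""
    await asyncio.sleep(0)  # yield control, simulating asynchronous work
    return Draft(prompt_name, agent_id, f"draft for {prompt_name}")


def review(reviewer: str, draft: Draft) -> bool:
    """Stand-in for the main agent's acceptance check."""
    if reviewer == draft.author:
        raise ValueError("write-review separation violated")
    return bool(draft.text.strip())  # placeholder quality criterion


async def run(prompt_names: list[str]) -> dict[str, bool]:
    # Writers run concurrently, one hypothetical sub-agent per prompt.
    drafts = await asyncio.gather(
        *(write_draft(f"sub-agent-{i}", name) for i, name in enumerate(prompt_names))
    )
    # A single, different agent reviews every draft afterwards.
    return {d.prompt_name: review("main-agent", d) for d in drafts}


results = asyncio.run(run(["product_intro", "weekly_report", "social_post"]))
print(results)
```

Raising an error when the reviewer matches the author makes the "those who write don't review" rule an enforced invariant rather than a convention.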

Dispatching 68 Sub-Agents in Action

The entire testing process went through four rounds, dispatching a cumulative total of 68 sub-agents. By entering the /workflow command, you can clearly see the current workflow orchestration status.

[Image: Claude Code Workflow Orchestration View]

From the screenshots, the specific concurrency looked like this:

Round 1: Concurrently launched 9 sub-agents
Round 2: Another 9 sub-agents launched concurrently, bringing the number running at once to roughly 10, counting any still active from Round 1
Subsequent rounds: Continued high-concurrency dispatching, totaling 68 across four rounds

This means Claude Code doesn't execute tasks serially one by one. It runs multiple sub-agents at the same time, so total wall-clock time is governed by the number of rounds rather than the number of tasks. According to the creator, the four rounds of prompt testing were "basically all completed very quickly."
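To make the difference between serial and concurrent dispatch concrete, here is a minimal simulation. It assumes a fixed concurrency cap of 9 (the number launched per round in the screenshots) and uses `asyncio.sleep` in place of real work. How Claude Code actually schedules its sub-agents is not described in the source, so treat the semaphore as one plausible way to bound concurrency, not as the tool's mechanism.

```python
import asyncio


async def dispatch(total_tasks: int, max_concurrent: int, seconds_per_task: float = 0.01):
    """Run simulated sub-agents with at most max_concurrent active at once."""
    semaphore = asyncio.Semaphore(max_concurrent)
    running = 0
    peak = 0

    async def sub_agent(task_id: int) -> int:
        nonlocal running, peak
        async with semaphore:  # wait for a free slot
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(seconds_per_task)  # simulated work
            running -= 1
        return task_id

    finished = await asyncio.gather(*(sub_agent(i) for i in range(total_tasks)))
    return len(finished), peak


completed, peak = asyncio.run(dispatch(total_tasks=68, max_concurrent=9))
print(f"completed={completed}, peak concurrency={peak}")
```

If every task took the same time, 68 tasks at 9 at a time would need ceil(68 / 9) = 8 waves instead of 68 sequential steps. Real sub-agents vary in duration, so the actual speedup will differ.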

Advantages and Costs of Workflow Mode

Core Advantages

Automatic task decomposition: No manual planning needed; the AI autonomously determines the workflow structure based on task complexity
High-concurrency execution: Sub-agents run concurrently (9 were launched at a time in the rounds shown in this test), which is far faster than testing each prompt sequentially by hand
Built-in quality control: Supports collaboration patterns like write-review separation, with the main agent handling acceptance review
Visual monitoring: Use the /workflow command to view task orchestration and execution status in real time

The Unavoidable Token Consumption Cost

The biggest drawback of this feature is that it is extremely token-hungry.

68 sub-agents means 68 independent AI calls, with each sub-agent needing to receive context, execute tasks, and return results. Add in the main agent's scheduling and review overhead, and the total token consumption is dozens of times that of a normal conversation. For users billed by token, this represents a significant expense.

To understand the technical root of this cost, you need to understand how LLM token billing works. Tokens are the basic units by which models process text: Chinese maps to roughly one token per 1-2 characters, while English maps to about 1-1.5 tokens per word. The cost of each AI call is determined by both "input tokens" and "output tokens," where input tokens include system prompts, context information, and user instructions.

In multi-agent scenarios, token consumption grows multiplicatively for three reasons:

First, each sub-agent needs to independently receive task context (including project background, code structure, specific instructions, etc.), and this context information is transmitted repeatedly
Second, the main agent needs to maintain global state during scheduling, and its context window continuously expands as subtasks increase
Third, the review phase requires the main agent to examine all sub-agent outputs one by one, generating substantial input token overhead

Using Claude 3.5 Sonnet's API pricing as a reference (approximately $3/million input tokens, $15/million output tokens), the scheduling cost of 68 sub-agents could reach 50-100 times that of a single normal conversation. However, Claude Code's Max subscription plans ($100/month or $200/month) provide a certain monthly usage allowance, somewhat alleviating this cost pressure.
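The billing arithmetic above can be turned into a rough estimator. The two prices are the ones quoted in this article. The token counts (10,000 context tokens and 2,000 output tokens per sub-agent) are made-up placeholders, not measurements from the test. The model is also deliberately simplified: it counts repeated context and the review pass, and ignores the main agent's own scheduling overhead and review output.

```python
INPUT_PRICE_PER_MILLION = 3.00    # USD, figure quoted in the article
OUTPUT_PRICE_PER_MILLION = 15.00  # USD, figure quoted in the article


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one model call billed per input and output token."""
    return (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    ) / 1_000_000


def workflow_cost(sub_agents: int, context_tokens: int, output_tokens_each: int) -> float:
    """Simplified estimate: repeated context plus one review pass."""
    # Each sub-agent receives its own copy of the context and writes its own output.
    writers = sub_agents * call_cost(context_tokens, output_tokens_each)
    # The main agent reads every sub-agent output back in as input during review.
    review = call_cost(sub_agents * output_tokens_each, 0)
    return writers + review


# Hypothetical token counts, chosen only to illustrate the calculation.
single = call_cost(10_000, 2_000)
total = workflow_cost(sub_agents=68, context_tokens=10_000, output_tokens_each=2_000)
print(f"single call: ${single:.2f}")
print(f"68 sub-agents: ${total:.2f} ({total / single:.1f}x a single call)")
```

With these placeholder inputs the estimate comes out at roughly 75 times a single call, which falls inside the 50-100x range mentioned above. Replacing the placeholders with token counts from your own project gives a quick budget check before a large run.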

Use Cases and Recommendations

Based on this hands-on test, Claude Code's Workflow mode is particularly well-suited for the following scenarios:

Batch testing: Such as the bulk prompt validation in this example
Large-scale code refactoring: Scenarios requiring simultaneous modifications across multiple files
Parallel multi-module development: When modules are relatively independent and can be processed concurrently
Automated QA: Quality assurance processes requiring write-review separation

Recommendations:

Only enable Super Coding mode when tasks are sufficiently complex; use the default mode for simple tasks
Estimate your token budget in advance to avoid unexpected overspending
Make good use of the /workflow command to monitor execution progress
Clearly specify division-of-labor requirements in your task descriptions (e.g., write-review separation) to help the AI better orchestrate the workflow

Conclusion

Claude Code's Workflow feature demonstrates an important direction for AI programming tools: moving from single-conversation coding to multi-agent collaborative coding. This trend isn't unique to Claude Code — the entire AI programming tool industry is evolving toward multi-agent architectures. OpenAI's Codex CLI has begun supporting multi-step task orchestration; Google's Jules (a Gemini-based AI coding agent) similarly employs task decomposition and parallel execution design principles; and in the open-source community, multi-agent frameworks like AutoGen and CrewAI are rapidly maturing, providing developers with infrastructure for building their own multi-agent programming systems. This paradigm shift from "solo operations to team collaboration" fundamentally reflects AI capability leaping from the "tool level" to the "system level" — AI is no longer just an assistant that answers questions or completes code, but is beginning to possess systemic engineering capabilities like project management, task allocation, and quality control.

The scenario of 68 sub-agents working concurrently is already very close to the collaboration model of a small development team. While token consumption is the main bottleneck today, as model inference costs continue to decline (over the past two years, API prices for mainstream LLMs have dropped by over 90%, and this trend is still accelerating), this "AI team operations" model will very likely become the dominant paradigm for AI-assisted programming in the future. Developers who are interested should give it a try firsthand.