Claude Code Workflow in Action: 68 Sub-Agents Working Concurrently

Claude Code's Workflow mode dispatched 68 concurrent sub-agents for batch prompt testing.
A deep dive into Claude Code's hidden Workflow mode, which enables multi-agent concurrent programming. By switching to Super Coding mode, the AI automatically decomposes complex tasks and dispatches sub-agents in parallel. In a real-world test involving batch prompt validation for a digital avatar project, 68 sub-agents were orchestrated across four rounds with a write-review separation mechanism for quality control. The trade-off: massive token consumption.
Claude Code's Hidden "Super Coding" Mode
Claude Code (Anthropic's command-line programming tool) has a powerful but little-known feature hidden within it — Workflow mode. This mode enables AI to automatically decompose complex tasks, dispatch dozens of sub-agents to execute concurrently, and achieve true "AI assembly line" operations. Recently, a Chinese tech content creator put this feature through its paces on a real project, with a single task dispatching 68 sub-agents — a stunning display of efficiency.
Unlike the common IDE plugin-style AI assistants (such as GitHub Copilot, Cursor, etc., which primarily offer code completion and chat within an editor), Claude Code is a command-line native tool that runs in the terminal. It can directly access the file system, execute Shell commands, and operate Git repositories, giving it deep awareness of the entire project. This design makes it naturally suited for more complex engineering-level task orchestration, rather than just single-file code assistance. Workflow mode is an advanced feature built on top of this architectural advantage.
How to Enable Workflow Mode
Enabling this feature isn't complicated, but it does require manual configuration:
- Enter the /effort command in Claude Code
- The default thinking mode is usually hi (high thinking mode)
- Switch it to the highest level: also coder (Super Coding Mode)

Once enabled, Claude Code will automatically decide whether to dispatch sub-agents based on the complexity of your submitted task. The "sub-agent" here is a core concept in Multi-Agent Systems — each sub-agent is an independent AI reasoning instance with its own context window and execution environment, capable of autonomously completing assigned subtasks and returning results to the main agent. Unlike simple function calls, sub-agents possess a degree of autonomous decision-making ability, allowing them to plan execution steps, invoke tools, and even perform multi-round reasoning as needed. This architecture enables complex tasks to be truly "divided and conquered."
At its core, this is a self-organizing programming workflow — the AI constructs a complete task orchestration plan on its own (similar to Harness), without requiring users to manually break down tasks. Harness, mentioned here, is a well-known CI/CD orchestration platform in the DevOps space. Its core capability is decomposing software delivery processes into multiple stages and steps, defining dependencies and parallelization strategies between them, and then automating the entire pipeline. Claude Code's Workflow mode borrows a similar philosophy: the AI plays the role of an "orchestration engine," automatically analyzing task dependencies, determining which subtasks can run in parallel and which need to wait in sequence, ultimately forming a dynamic task DAG (Directed Acyclic Graph). The difference is that traditional CI/CD tools require manually writing pipeline configuration files, whereas Claude Code's workflow is entirely generated autonomously by the AI.
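The idea of a task DAG with parallel and sequential steps can be illustrated with a minimal Python sketch. This is not Claude Code's internal scheduler, which the article does not describe; the task names, the `run_subtask` stand-in, and the graph itself are all hypothetical. It only shows how independent subtasks can run together while dependent ones wait.

```python
import asyncio
from graphlib import TopologicalSorter

# Hypothetical task graph: each key depends on the tasks in its set.
TASKS = {
    "plan": set(),
    "write_a": {"plan"},
    "write_b": {"plan"},
    "write_c": {"plan"},
    "review": {"write_a", "write_b", "write_c"},
}


async def run_subtask(name: str) -> str:
    """Stand-in for a sub-agent; a real one would call a model and tools."""
    await asyncio.sleep(0.01)
    return f"{name}: done"


async def run_dag(tasks: dict[str, set[str]]) -> list[list[str]]:
    """Run the graph wave by wave; tasks within one wave run concurrently."""
    sorter = TopologicalSorter(tasks)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose dependencies are finished
        await asyncio.gather(*(run_subtask(name) for name in ready))
        sorter.done(*ready)                 # unlocks the tasks that depend on this wave
        waves.append(ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(asyncio.run(run_dag(TASKS)), start=1):
        print(f"wave {number}: {wave}")
```

In this sketch the three writing tasks form a single wave because they depend only on the plan, while the review waits for all of them.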
Real-World Scenario: Batch Testing a Digital Avatar's Writing Skills
Task Background
The creator was developing a digital avatar project that includes a Writing Skill module. This module contains a large number of prompts for different scenarios, each of which needed to be tested individually for writing quality.

Task Division: Write-Review Separation
The task design was quite clever, employing a "write-review separation" principle:
- Main agent: Responsible for overall supervision and task scheduling, as well as final acceptance review
- Sub-agents: Dispatched to execute specific writing tasks, generating content using different prompts
- Review phase: Unified review by the main agent, ensuring "those who write don't review, and those who review don't write"

This design draws from the Code Review philosophy in software engineering — producers and reviewers must be separated to ensure quality. In software engineering practice, this principle has deep theoretical foundations: psychological research shows that content creators have a natural "Confirmation Bias" toward their own output, tending to overlook flaws in their own work. From early "Peer Review" practices to the Pull Request review mechanisms widely adopted in modern open-source communities, "separation of production and review responsibilities" has become one of the cornerstone principles of software quality assurance. Google's internal engineering practices even require every line of code to be reviewed by at least one non-author engineer before it can be merged into the main branch. Claude Code transfers this human engineering wisdom to AI multi-agent collaboration, having different AI instances take on the roles of "producer" and "reviewer" respectively, thereby building a self-consistent quality control loop within the AI system.
Claude Code's Workflow mode natively supports this kind of multi-role collaboration.
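The write-review separation rule can be expressed as a small invariant: the reviewer of a draft must never be its author. The Python sketch below is an illustration under assumptions, not the creator's actual setup; the `Draft` type, the agent names, and the placeholder quality check are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Draft:
    prompt_id: str
    author: str  # name of the agent that wrote the draft
    text: str


def write(agent: str, prompt_id: str) -> Draft:
    """Stand-in for a writer sub-agent generating content from one prompt."""
    return Draft(prompt_id, agent, f"sample output for {prompt_id}")


def review(reviewer: str, draft: Draft) -> dict:
    """Stand-in for the review step; rejects any attempt at self-review."""
    if reviewer == draft.author:
        raise ValueError(f"{reviewer} cannot review its own draft")
    passed = len(draft.text) > 0  # placeholder check, not a real quality metric
    return {"prompt_id": draft.prompt_id, "reviewer": reviewer, "passed": passed}


# Sub-agents write; the main agent reviews every draft.
drafts = [write(f"sub-agent-{i}", f"prompt-{i}") for i in range(1, 4)]
reports = [review("main-agent", draft) for draft in drafts]
```

Enforcing the rule inside `review` rather than relying on the caller means a scheduling mistake surfaces as an error instead of a silently self-approved draft.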
Dispatching 68 Sub-Agents in Action
The entire testing process went through four rounds, dispatching a cumulative total of 68 sub-agents. By entering the /workflow command, you can clearly see the current workflow orchestration status.

From the screenshots, the specific concurrency looked like this:
- Round 1: Concurrently launched 9 sub-agents
- Round 2: Another 9 sub-agents launched concurrently, with approximately 10 running in total including those from the previous round
- Subsequent rounds: Continued high-concurrency dispatching, totaling 68 across four rounds
This means Claude Code doesn't execute tasks serially one by one — it can run multiple sub-agents concurrently, dramatically reducing overall execution time. The four rounds of prompt testing were "basically all completed very quickly."
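To get a feel for why concurrent dispatch shortens wall-clock time, here is a rough Python simulation. It assumes a fixed cap of 9 simultaneous sub-agents (the per-round figure from the screenshots) and an arbitrary, made-up task duration; how Claude Code actually schedules its rounds is not documented in this article, so treat this as a model of the general technique only.

```python
import asyncio
import time

TOTAL_TASKS = 68     # total sub-agents reported in the test
MAX_CONCURRENT = 9   # per-round concurrency seen in the screenshots
TASK_SECONDS = 0.02  # assumed, purely illustrative duration per sub-agent


async def sub_agent(task_id: int, limit: asyncio.Semaphore) -> int:
    """Simulated sub-agent; the semaphore caps how many run at once."""
    async with limit:
        await asyncio.sleep(TASK_SECONDS)
        return task_id


async def run_all() -> tuple[list[int], float]:
    """Dispatch every task and return the results with the elapsed time."""
    limit = asyncio.Semaphore(MAX_CONCURRENT)
    start = time.perf_counter()
    results = await asyncio.gather(
        *(sub_agent(i, limit) for i in range(TOTAL_TASKS))
    )
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(run_all())
serial_estimate = TOTAL_TASKS * TASK_SECONDS  # time if tasks ran one by one
print(f"{len(results)} tasks: concurrent {elapsed:.2f}s, serial about {serial_estimate:.2f}s")
```

With a cap of 9, the 68 simulated tasks finish in roughly eight batches rather than 68 sequential steps.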
Advantages and Costs of Workflow Mode
Core Advantages
- Automatic task decomposition: No manual planning needed; the AI autonomously determines the workflow structure based on task complexity
- High-concurrency execution: In this test, the first two rounds each launched 9 sub-agents concurrently, far more efficient than manual sequential testing
- Built-in quality control: Supports collaboration patterns like write-review separation, with the main agent handling acceptance review
- Visual monitoring: Use the /workflow command to view task orchestration and execution status in real time
The Unavoidable Token Consumption Cost
The biggest drawback of this feature is that it is extremely token-hungry.
68 sub-agents means 68 independent AI calls, with each sub-agent needing to receive context, execute tasks, and return results. Add in the main agent's scheduling and review overhead, and the total token consumption is dozens of times that of a normal conversation. For users billed by token, this represents a significant expense.
To understand the technical root of this cost, it helps to understand how LLM token billing works. Tokens are the basic units by which models process text: as a rough guide, Chinese maps to about one token per 1-2 characters, while English maps to about 1-1.5 tokens per word. The cost of each AI call is determined by both "input tokens" and "output tokens," where input tokens include system prompts, context information, and user instructions. In multi-agent scenarios, token consumption grows multiplicatively for three reasons:
- Each sub-agent needs to independently receive task context (project background, code structure, specific instructions, and so on), so the same context is transmitted repeatedly
- The main agent needs to maintain global state during scheduling, and its context window keeps expanding as subtasks increase
- The review phase requires the main agent to examine all sub-agent outputs one by one, generating substantial input token overhead
Using Claude 3.5 Sonnet's API pricing as a reference (approximately $3/million input tokens, $15/million output tokens), the scheduling cost of 68 sub-agents could, as a rough estimate, reach 50-100 times that of a single normal conversation, depending on how much context each sub-agent receives. However, Claude Code's Max subscription plans ($100/month or $200/month) provide a certain monthly usage allowance, somewhat alleviating this cost pressure.
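The arithmetic behind this kind of estimate can be sketched in a few lines of Python. The per-million prices are the figures quoted above; every token count below is a hypothetical placeholder chosen only to show the calculation, since the article reports no measured usage. The resulting ratio depends entirely on those assumptions.

```python
INPUT_PRICE_PER_M = 3.00    # USD per million input tokens (figure quoted above)
OUTPUT_PRICE_PER_M = 15.00  # USD per million output tokens (figure quoted above)


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one model call at the prices above."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Hypothetical token counts, for illustration only.
SUB_AGENTS = 68
CONTEXT_PER_AGENT = 8_000     # context each sub-agent receives
OUTPUT_PER_AGENT = 1_500      # text each sub-agent returns
MAIN_AGENT_OVERHEAD = 20_000  # main agent's own scheduling context
MAIN_AGENT_OUTPUT = 5_000     # main agent's review write-up

# Every sub-agent pays for its own copy of the context.
sub_agent_cost = SUB_AGENTS * call_cost(CONTEXT_PER_AGENT, OUTPUT_PER_AGENT)

# The main agent reads all sub-agent outputs as input during review.
review_input = SUB_AGENTS * OUTPUT_PER_AGENT + MAIN_AGENT_OVERHEAD
review_cost = call_cost(review_input, MAIN_AGENT_OUTPUT)

total_cost = sub_agent_cost + review_cost
single_call = call_cost(CONTEXT_PER_AGENT, OUTPUT_PER_AGENT)

print(f"estimated total: ${total_cost:.2f}")
print(f"ratio to one call: {total_cost / single_call:.0f}x")
```

Swapping in measured token counts from a real session would turn this from an illustration into a usable budget estimate.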
Use Cases and Recommendations
Based on this hands-on test, Claude Code's Workflow mode is particularly well-suited for the following scenarios:
- Batch testing: Such as the bulk prompt validation in this example
- Large-scale code refactoring: Scenarios requiring simultaneous modifications across multiple files
- Parallel multi-module development: When modules are relatively independent and can be processed concurrently
- Automated QA: Quality assurance processes requiring write-review separation
Recommendations:
- Only enable Super Coding mode when tasks are sufficiently complex; use the default mode for simple tasks
- Estimate your token budget in advance to avoid unexpected overspending
- Make good use of the /workflow command to monitor execution progress
- Clearly specify division-of-labor requirements in your task descriptions (e.g., write-review separation) to help the AI better orchestrate the workflow
Conclusion
Claude Code's Workflow feature demonstrates an important direction for AI programming tools: moving from single-conversation coding to multi-agent collaborative coding. This trend isn't unique to Claude Code — the entire AI programming tool industry is evolving toward multi-agent architectures. OpenAI's Codex CLI has begun supporting multi-step task orchestration; Google's Jules (a Gemini-based AI coding agent) similarly employs task decomposition and parallel execution design principles; and in the open-source community, multi-agent frameworks like AutoGen and CrewAI are rapidly maturing, providing developers with infrastructure for building their own multi-agent programming systems. This paradigm shift from "solo operations to team collaboration" fundamentally reflects AI capability leaping from the "tool level" to the "system level" — AI is no longer just an assistant that answers questions or completes code, but is beginning to possess systemic engineering capabilities like project management, task allocation, and quality control.
The scenario of 68 sub-agents working concurrently is already very close to the collaboration model of a small development team. While token consumption is the main bottleneck today, as model inference costs continue to decline (over the past two years, API prices for mainstream LLMs have dropped by over 90%, and this trend is still accelerating), this "AI team operations" model will very likely become the dominant paradigm for AI-assisted programming in the future. Developers who are interested should give it a try firsthand.