Claude Code Multi-Agent Collaboration in Practice: Subagent Division of Labor and a Lite TARS Concept

A multi-Agent experiment reveals the path and the bottlenecks as AI programming evolves from tools into collaborative systems.
A developer built a Planner + Packager dual-Subagent pipeline with Claude Code and achieved semi-automated project iteration. He then envisioned a top-level Agent, "TAS", to orchestrate multiple Agents, but was blocked by a hard limitation: Subagents cannot call each other. The experiment suggests that single-point automation is approaching its ceiling and that inter-Agent communication protocols will be the next competitive focus.
Introduction: What's the Real Challenge in AI Collaboration?
When we talk about AI programming tools, most people focus on the capabilities of a single Agent: how well it writes code and how complex a requirement it can understand. But one developer (Bilibili creator "路过的阿诚") spent an entire day experimenting with Claude Code and DeepSeek V4 on something more ambitious: breaking the project iteration workflow into multiple Subagents and having the main conversation "direct" smaller AIs to do the work.
The conclusion was surprising. The workflow did run successfully, but the real challenge was not whether division of labor is possible; it was whether these Subagents can collaborate with each other. This discovery also sparked his idea of building a "lite TARS" (a system that orchestrates and coordinates multiple Agents).

The Basic Subagent Pipeline: How Planner + Packager Divide the Work
Role Definitions for Two Subagents
The first step of the experiment was defining two functionally distinct Subagents:
- Planner: Responsible for analyzing requirements, planning the changes for the current iteration, and outputting three tiers of proposals (high/medium/low) for selection
- AutoPackage (Packager): Responsible for automatically packaging the project after code iteration is complete
There's a key architectural design worth noting here: Subagents have independent contexts. They bring only the final result back to the main conversation and do not consume the main conversation's context space. Behind this design lies the most fundamental constraint of LLMs (Large Language Models): the token window. Context windows are finite, commonly on the order of tens to hundreds of thousands of tokens, and as a conversation approaches or exceeds that limit, earlier information is "forgotten" or reasoning quality degrades.
Isolating subtasks in independent Subagents, so that the main conversation receives only final results rather than lengthy intermediate reasoning, is essentially an elegant "context compression" strategy. The Planner can perform extensive analysis and reasoning during planning, and none of that intermediate work will "pollute" the main conversation's token window.
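The isolation described above can be illustrated with a toy model. This is a minimal sketch, not Claude Code's actual mechanism: `Conversation`, `run_subagent`, and the word-count "tokenizer" are hypothetical stand-ins chosen only to show that the main context grows by one short result, however much the Subagent reasons internally.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """A toy context: a list of messages plus a rough token count."""
    messages: list[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.messages.append(text)

    def token_count(self) -> int:
        # Crude stand-in for a tokenizer: one token per whitespace-separated word.
        return sum(len(m.split()) for m in self.messages)


def run_subagent(task: str, reasoning_steps: int = 50) -> str:
    """Do verbose work in a private context and return only a short result."""
    private = Conversation()
    private.add(task)
    for i in range(reasoning_steps):
        private.add(f"intermediate reasoning step {i} about {task}")
    # The private context goes out of scope here; only the summary survives.
    return f"Result for '{task}': {reasoning_steps} steps condensed into one line"


# The main conversation sees the request and the final result, nothing else.
main = Conversation()
main.add("Plan the next iteration")
summary = run_subagent("Plan the next iteration")
main.add(summary)
print(len(main.messages), main.token_count())
```

Raising `reasoning_steps` makes the private context arbitrarily large while the main conversation's size stays the same, which is the point of the design.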
Complete Workflow
The pipeline runs in the following order:
- Call the Planner through the main conversation to plan the current changes
- Planner returns proposals (three tiers: high/medium/low) to the main conversation
- The main conversation begins iterating code based on the Planner's proposal
- After code iteration is complete, call AutoPackage for automatic packaging
- Once packaging succeeds, run verification

Experimental results showed that this pipeline runs successfully. The entire process from planning to packaging achieved semi-automation—human developers only need to make decisions at key checkpoints, while the specific planning and packaging work is handled by the corresponding Subagents.
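The flow above can be sketched as a single function in which the main conversation calls each Subagent in turn. All names here (`planner`, `auto_package`, `verify`, `run_iteration`) are hypothetical placeholders rather than Claude Code APIs, and the human checkpoint is modeled as a `choose_tier` callback.

```python
from typing import Callable


def planner(requirement: str) -> dict[str, str]:
    """Hypothetical Planner: returns one proposal per tier."""
    return {
        tier: f"{tier} proposal for: {requirement}"
        for tier in ("high", "medium", "low")
    }


def auto_package(code: str) -> str:
    """Hypothetical AutoPackage: wraps the code into an artifact."""
    return f"package({code})"


def verify(artifact: str) -> bool:
    """Hypothetical verification step run after packaging."""
    return artifact.startswith("package(")


def run_iteration(
    requirement: str,
    choose_tier: Callable[[dict[str, str]], str],
    write_code: Callable[[str], str],
) -> bool:
    proposals = planner(requirement)          # steps 1-2: plan, return three tiers
    chosen = proposals[choose_tier(proposals)]  # human decision at the checkpoint
    code = write_code(chosen)                 # step 3: main conversation iterates code
    artifact = auto_package(code)             # step 4: automatic packaging
    return verify(artifact)                   # step 5: verification


ok = run_iteration(
    "add dark mode",
    choose_tier=lambda proposals: "medium",
    write_code=lambda proposal: f"code implementing [{proposal}]",
)
print(ok)
```

Every call passes through `run_iteration`, which plays the role of the main conversation: neither Subagent knows the other exists.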
A Bigger Ambition: Envisioning a "Lite TARS" Multi-Agent Coordination System
The Leap from Division of Labor to Collaboration
After getting the basic pipeline working, a bolder idea naturally emerged: What if there could be a top-level Agent above the Planner and AutoPackage—one with personality, familiar with the developer's habits, capable of orchestrating multiple Agents?

This top-level Agent was named "TAS", a tribute to the robot TARS from the movie Interstellar. TARS's design features include high autonomy, adjustable "honesty" and "humor" parameters, and the ability to make independent judgments in extreme environments. Naming the top-level coordinating Agent after it hints at the developer's deeper expectations for AI systems: not just a tool that executes instructions, but a collaborative partner with judgment that can make autonomous decisions based on context. This matches the emerging "Agentic AI" concept in the AI Agent field.
TAS's design goals include:
- Orchestration: Managing multiple subordinate Agents including Planner, Coder, Reviewer, etc.
- Intelligent Decision-Making: After the Reviewer audits code, TAS decides based on preset principles whether to directly modify or require re-planning
- Tiered Processing: Major changes require human input; minor changes are handled automatically
- Personalization: Making judgments based on principles and preferences previously provided by the developer
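The decision-making, tiered-processing, and personalization goals above can be made concrete as a small routing policy. This is only a sketch under assumed rules: the `Review` fields, the `Preferences` thresholds, and the four `Action` values are hypothetical and do not come from the original experiment.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    APPROVE = "approve"
    FIX_DIRECTLY = "fix_directly"
    REPLAN = "replan"
    ASK_HUMAN = "ask_human"


@dataclass(frozen=True)
class Review:
    """Hypothetical summary produced by a Reviewer Agent."""
    issues: int
    files_touched: int
    changes_public_api: bool


@dataclass(frozen=True)
class Preferences:
    """Hypothetical developer-provided principles (the personalization part)."""
    max_auto_files: int = 3
    max_auto_issues: int = 5


def decide(review: Review, prefs: Preferences = Preferences()) -> Action:
    # Tiered processing: major changes always go to the human.
    if review.changes_public_api or review.files_touched > prefs.max_auto_files:
        return Action.ASK_HUMAN
    if review.issues == 0:
        return Action.APPROVE
    # Few issues: patch in place. Many issues: the plan itself is suspect.
    if review.issues <= prefs.max_auto_issues:
        return Action.FIX_DIRECTLY
    return Action.REPLAN


print(decide(Review(issues=2, files_touched=1, changes_public_api=False)))
```

Keeping the policy a pure function of the review and the stored preferences makes each decision easy to audit, which matters if a top-level Agent is allowed to act without asking.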

It's worth noting that multi-Agent collaboration isn't a new concept. Its theoretical foundation traces back to Multi-Agent Systems (MAS) research in the 1980s. In the contemporary AI programming space, OpenAI's Swarm framework, Microsoft's AutoGen, and LangChain's Agent framework are all exploring similar architectures. The core tension these systems face is the same: how to achieve efficient cross-Agent information flow while keeping each Agent focused. This is precisely the wall this experiment hit.
The Reality Barrier: Subagents Cannot Call Each Other
The vision was grand, but reality was harsh. When attempting to implement this architecture, a hard limitation surfaced: Claude Code's Subagents cannot call each other.
This means that when the Reviewer finds issues, it cannot directly notify the Coder to fix them; when the Coder completes modifications, it cannot automatically trigger the Reviewer for re-review. All information flow must pass through the main conversation, which severely limits the flexibility and efficiency of multi-Agent collaboration.
Behind this limitation are deep architectural design trade-offs. Allowing Agents to freely call each other introduces risks of circular dependencies, deadlocks, and infinite recursion, and makes the system much harder to debug. The industry's current mainstream solutions fall into three categories:
- Message queues: an event-driven architecture where Agents collaborate through publish/subscribe messaging
- Shared state storage: a store such as a vector database serves as a "bulletin board" between Agents, with each Agent reading and writing shared memory
- Orchestration layers: an Orchestrator handles unified scheduling, and all Agents communicate only with the orchestration layer rather than directly with each other
Claude Code's current architecture is closest to the third approach, but the orchestration role is filled by the main conversation under the human's direction and has not yet been automated.
Claude's official documentation confirmed that this is a current architectural limitation. In the developer's own words, it was like "the enterprise failed halfway before reaching its goal."
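The orchestration-layer approach, together with a guard against the infinite-loop risk mentioned above, can be sketched as a relay loop. This is an illustrative toy, not Claude Code behavior: `coder`, `reviewer`, and `orchestrate` are hypothetical functions, and the Reviewer's acceptance rule is contrived so the loop terminates predictably.

```python
def coder(task: str, feedback: list[str]) -> str:
    """Hypothetical Coder: each accumulated feedback item yields a new revision."""
    return f"{task} (revision {len(feedback)})"


def reviewer(code: str) -> list[str]:
    """Hypothetical Reviewer: accepts revision 2, rejects anything else."""
    return [] if "revision 2" in code else [f"please fix: {code}"]


def orchestrate(task: str, max_rounds: int = 5) -> tuple[str, int]:
    """Relay between Coder and Reviewer; the two never call each other."""
    feedback: list[str] = []
    for round_no in range(1, max_rounds + 1):
        code = coder(task, feedback)
        issues = reviewer(code)
        if not issues:
            return code, round_no
        # The orchestrator carries the Reviewer's findings back to the Coder.
        feedback.extend(issues)
    # A hard cap keeps a disagreement from looping forever.
    raise RuntimeError(f"no approved revision after {max_rounds} rounds")


final, rounds = orchestrate("parse config")
print(final, rounds)
```

Because every message passes through `orchestrate`, loop detection and the round limit live in one place. The cost is that the orchestrator becomes the bottleneck, which is the trade-off the experiment ran into.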
Future Directions: Can Claude Code's Team Feature Break Through?
From Solo Operations to Team Collaboration
Although the current Subagent architecture has limitations, exploration hasn't stopped. Further research revealed that Claude Code is beta-testing a Team feature, which could be the key to breaking through current limitations.
The core philosophy of the Team feature aligns closely with the TAS concept: multiple AI Agents collaborate within a higher-level framework rather than merely completing tasks independently and reporting results. From a technical perspective, the Team feature likely introduces some form of automated orchestration layer, so that information flow between Agents no longer depends entirely on relay through the human-directed main conversation. That could unlock the potential of multi-Agent collaboration while keeping the architecture safe.
Core Insights from This Experiment
Although this experiment was temporarily blocked in implementing the "lite TARS," it revealed several important industry trends:
- The ceiling of single-point automation is already visible: No matter how powerful a single AI Agent is, it needs division of labor and collaboration when facing complex projects
- Context management is a critical architectural decision: The independent context design of Subagents is fundamentally a response to the LLM token window limit
- Inter-Agent communication protocols will become the next competitive focus: Whoever can first solve the Agent inter-calling problem—whether through message queues, shared state, or orchestration layers—will gain an advantage in the AI programming tools race
- The boundary of human-AI collaboration is dynamically shifting: From "humans write code" to "humans direct AI to write code" to "humans only make key decisions while AI coordinates and completes the rest"—this evolutionary path is already clearly visible
Conclusion
The value of this experiment lies not in whether the lite TARS was ultimately built, but in how clearly it demonstrates the evolutionary path and current technical bottlenecks of AI programming tools as they move from "tools" to "collaborative systems." The Planner + Packager pipeline proves that multi-Agent division of labor is feasible, while the limitation on Subagents calling each other marks the bottleneck that needs to be broken through next.
Developers currently using Claude Code can already automate planning and packaging with Subagents while keeping an eye on the Team feature's beta progress. For the AI programming field as a whole, the real value lies not in single-point automation but in AI collaborative systems, a judgment that will likely be validated repeatedly over the next year or two.