Claude Code Agent Architecture Source Code Deep Dive: Agent Loop and Tool System in Practice

Two Core Directions for Developers in the AI Era

With AI technology advancing at breakneck speed, frontend and full-stack developers face growing pressure to adapt. Many industry commentators expect pure frontend positions to shrink noticeably within the next one to two years, so developers need to build core competitiveness in two directions: AI-powered efficiency and AI Agent development.

AI Agents are one of the most important technology trends in the AI field for 2024-2025. Unlike traditional chatbots, Agents possess autonomous planning, tool usage, and environment awareness capabilities, enabling them to independently complete complex multi-step tasks. Leading companies like OpenAI, Anthropic, and Google all view Agents as the core paradigm for next-generation AI applications. Anthropic's Claude Code, OpenAI's Codex CLI, and Microsoft's GitHub Copilot Agent mode mark the evolution of AI coding assistants from "code completion" to an agentic paradigm that "autonomously completes development tasks."

AI-powered efficiency means proficiently using tools like Codex and Claude Code to boost coding productivity, letting AI assist development around the clock. AI Agent development means being able to design and implement workflow-based or general-purpose agents for vertical business applications. These two capabilities have become the biggest differentiators in technical interviews.

Article Overview

This article uses Claude Code's agent architecture as an entry point, analyzes its core modules in depth, and shows how to build a Mini version of Claude Code from scratch using TypeScript.

Three Levels of Claude Code Interview Questions

Before diving into technical details, let's clarify the learning objectives. Interview questions about Claude Code agents typically fall into three levels:

Junior to Mid-Level: Experience with AI Coding Tools

Interviewers will ask: "Do you have experience with AI coding tools like Codex or Claude Code? How do you apply them in frontend and full-stack development?" This tests tool proficiency and real-world efficiency scenarios.

Mid to Senior Level: Agent Architecture Design Ability

More advanced questions include: "Can you describe the overall design and implementation of a general-purpose agent like Claude Code, including the Agent Loop, Skills, multi-agent collaboration, etc.?" This requires understanding the core architectural principles of agents.

Expert Level: Ability to Independently Develop Agents

The highest-level question: "If you were asked to develop a general-purpose agent similar to Claude Code, what would your approach be?" This tests not only source code understanding but also hands-on agent development experience and system design capability.

Figure: Interview question levels

Four Core Modules of the Claude Code Agent

As a general-purpose agent, Claude Code's architecture can be broken down into four core modules. Understanding these modules gives you the foundational framework for building any AI Agent.

Module 1: System Prompt

The system prompt is the "soul" of an agent — it defines the Agent's role, capability boundaries, and behavioral guidelines. In Claude Code, the system prompt includes not only basic role definitions but also dynamically injected context information about the current project, enabling the Agent to understand the project structure and make informed decisions.

This dynamic injection mechanism is what sets Claude Code apart from ordinary chatbots — it ensures the model always "knows" which project it's operating on, what tech stack the project uses, and what constraints exist. From a technical evolution perspective, this represents the leap from Prompt Engineering to Context Engineering. Prompt Engineering focuses on optimizing individual prompts, while Context Engineering is a higher-level concept popularized in 2025 by figures like Shopify CEO Tobi Lütke, emphasizing the systematic management of all context information the model receives — including system prompts, conversation history, retrieved documents, tool definitions, memory content, and more. In Agent development, how efficiently you organize this information within a limited context window directly determines the agent's performance quality. The dynamic injection and layered memory system in Claude Code's architecture are engineering implementations of the Context Engineering philosophy.
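As a minimal sketch of dynamic injection, the snippet below assembles a system prompt from a fixed role definition plus per-project context. The `ProjectContext` shape, the `buildSystemPrompt` helper, and the prompt wording are illustrative assumptions, not Claude Code's actual prompt or internals.

```typescript
// Hypothetical shape of the project context gathered before a session starts.
interface ProjectContext {
  rootDir: string;
  techStack: string[];
  constraints: string[];
}

// Static part of the prompt: role and behavioral guidelines (wording is illustrative).
const BASE_PROMPT =
  "You are a coding agent. Use the provided tools to inspect and modify the project.";

// Combine the static role definition with context about the current project.
function buildSystemPrompt(ctx: ProjectContext): string {
  const sections = [
    BASE_PROMPT,
    `Project root: ${ctx.rootDir}`,
    `Tech stack: ${ctx.techStack.join(", ") || "unknown"}`,
  ];
  if (ctx.constraints.length > 0) {
    sections.push("Constraints:\n" + ctx.constraints.map((c) => `- ${c}`).join("\n"));
  }
  return sections.join("\n\n");
}

const prompt = buildSystemPrompt({
  rootDir: "/work/shop-frontend",
  techStack: ["TypeScript", "React"],
  constraints: ["Do not edit files under dist/"],
});
console.log(prompt);
```

Rebuilding the prompt at the start of each session is one simple way to keep the model's view of the project current without editing the role definition itself.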

Figure: Architecture design

Module 2: Agent Loop

The Agent Loop is the core execution mechanism of the entire agent and the most critical part of Claude Code's architecture.

The Agent Loop's design traces back to the ReAct (Reasoning + Acting) paradigm, proposed in 2022 by researchers from Princeton University and Google Research. ReAct has the LLM alternate between "Thought" and "Action": the model first analyzes the current state and goal to decide the next operation; after the operation runs and returns an Observation, the model continues reasoning with the new information. This think-act-observe loop lets Agents handle complex tasks that require multi-step reasoning and external information retrieval, and it underlies mainstream Agent frameworks and products such as LangChain, AutoGPT, and Claude Code.

Specifically in Claude Code, the Agent Loop workflow is as follows:

Receive user input: Parse the user's natural language instructions
Model reasoning: Call the LLM to analyze the task and decide the next action
Tool Calls: Based on the model's output, invoke the appropriate tools to execute operations
Result feedback: Return tool execution results to the model
Loop evaluation: The model decides whether to continue calling tools or output the final result

This loop continues until the task is complete. For example, when a user inputs "analyze the current project structure," the Agent first calls the list_files tool to list all files, then reads the contents of key files one by one, and finally synthesizes the analysis and outputs conclusions.
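The five steps above can be sketched as a short loop. This is a simplified illustration under stated assumptions, not Claude Code's source: the `Message`, `ModelReply`, and `Tool` types are hypothetical, and a scripted stand-in replaces the real LLM so the example runs without a model.

```typescript
type Message =
  | { role: "system" | "user" | "assistant"; content: string }
  | { role: "tool"; name: string; content: string };

interface ToolCall { name: string; args: Record<string, unknown>; }

// A reply with no tool calls is treated as the final answer.
interface ModelReply { content: string; toolCalls: ToolCall[]; }

type Model = (messages: Message[]) => Promise<ModelReply>;
type Tool = (args: Record<string, unknown>) => Promise<string>;

async function runAgentLoop(
  model: Model,
  tools: Map<string, Tool>,
  systemPrompt: string,
  userInput: string,
  maxTurns = 10, // safety cap so a confused model cannot loop forever
): Promise<string> {
  const messages: Message[] = [
    { role: "system", content: systemPrompt },
    { role: "user", content: userInput },
  ];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await model(messages); // model reasoning
    messages.push({ role: "assistant", content: reply.content });
    if (reply.toolCalls.length === 0) return reply.content; // loop evaluation: done
    for (const call of reply.toolCalls) { // tool calls
      const tool = tools.get(call.name);
      const result = tool
        ? await tool(call.args).catch((e: unknown) => `Tool error: ${String(e)}`)
        : `Unknown tool: ${call.name}`;
      messages.push({ role: "tool", name: call.name, content: result }); // result feedback
    }
  }
  return "Stopped: turn limit reached before the task completed.";
}

// Scripted stand-in for the LLM: asks for list_files once, then answers.
const scriptedModel: Model = async (messages) => {
  const last = messages[messages.length - 1];
  if (last && last.role === "tool") {
    return { content: `The project contains: ${last.content}`, toolCalls: [] };
  }
  return { content: "", toolCalls: [{ name: "list_files", args: { dir: "." } }] };
};

const demoTools = new Map<string, Tool>([
  ["list_files", async () => "package.json, src/index.ts"],
]);

const demoRun = runAgentLoop(
  scriptedModel,
  demoTools,
  "You are a coding agent.",
  "analyze the current project structure",
);
demoRun.then((answer) => console.log(answer));
```

Tool failures and unknown tool names are returned to the model as text instead of being thrown, so the model gets a chance to correct itself on the next turn.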

Figure: Tool calls in the Agent Loop

Module 3: Tool System

The tool system is the bridge between the Agent and the external world. Large language models themselves cannot perform any actual operations — they can't read or write files, execute commands, or access the network. All these capabilities must be implemented through tools.

The tool system's underlying mechanism is Function Calling, the de facto interface between LLMs and external tools. OpenAI introduced it for its GPT models in June 2023, and Anthropic, Google, and other vendors followed with comparable tool-use APIs. The core principle: developers predefine a set of tool descriptions in JSON Schema format (function name, parameter types, and a description of what the tool does). During reasoning, the model decides whether it needs a tool; if so, it outputs the call as structured JSON instead of a natural language response. The host program parses the call, executes the actual operation, and feeds the result back to the model for continued reasoning. This mechanism turns an LLM from something that "can only talk" into an intelligent core that "can take action."
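To make the mechanism concrete, here is a sketch of one tool description and the host-side check of a model-emitted call. The layout follows the JSON Schema style used by OpenAI-compatible chat APIs, where `arguments` arrives as a JSON-encoded string; the `validateCall` helper and the sample model output are hypothetical.

```typescript
// Tool description in the JSON Schema style used by OpenAI-compatible chat APIs.
const readFileTool = {
  type: "function",
  function: {
    name: "read_file",
    description: "Read the contents of a file in the project",
    parameters: {
      type: "object",
      properties: {
        path: { type: "string", description: "Path relative to the project root" },
      },
      required: ["path"],
    },
  },
} as const;

// What the model emits instead of prose when it decides to call a tool.
interface RawToolCall {
  function: { name: string; arguments: string };
}

// Check the call against the declared tool before the host executes anything.
function validateCall(call: RawToolCall): { name: string; args: Record<string, unknown> } {
  if (call.function.name !== readFileTool.function.name) {
    throw new Error(`Unknown tool: ${call.function.name}`);
  }
  const args = JSON.parse(call.function.arguments) as Record<string, unknown>;
  for (const key of readFileTool.function.parameters.required) {
    if (typeof args[key] !== "string") {
      throw new Error(`Missing or invalid argument: ${key}`);
    }
  }
  return { name: call.function.name, args };
}

// Hypothetical model output for the request "show me src/index.ts".
const modelOutput: RawToolCall = {
  function: { name: "read_file", arguments: '{"path":"src/index.ts"}' },
};
const parsedCall = validateCall(modelOutput);
console.log(parsedCall.name, parsedCall.args);
```

The model never touches the file system itself: it only produces the structured request, and the host decides whether and how to run it.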

A minimal tool set for a Claude Code-style agent typically includes:

list_files: List the file structure under a directory
read_file: Read the contents of a specified file
write_file: Create or modify files
execute_command: Execute terminal commands
search_files: Search file contents

Each tool has clearly defined input parameters and output formats, and the model selects and invokes appropriate tools through the Function Calling mechanism. This system gives the agent the real ability to "get things done."
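A possible implementation of three of these tools on top of Node.js's `fs` module is sketched below. The `createTools`, `requireArg`, and `resolveInside` helpers are hypothetical, and the path check that keeps tools inside the project root is an added assumption rather than something the architecture description specifies. `execute_command` and `search_files` are left out to keep the sketch short.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type ToolArgs = Record<string, string | undefined>;
type ToolFn = (args: ToolArgs) => string;

interface MiniTools {
  list_files: ToolFn;
  read_file: ToolFn;
  write_file: ToolFn;
}

function requireArg(args: ToolArgs, name: string): string {
  const value = args[name];
  if (value === undefined) throw new Error(`Missing argument: ${name}`);
  return value;
}

// Resolve a model-supplied path and refuse anything that escapes the project root.
function resolveInside(root: string, relative: string): string {
  const full = path.resolve(root, relative);
  if (full !== root && !full.startsWith(root + path.sep)) {
    throw new Error(`Path escapes project root: ${relative}`);
  }
  return full;
}

function createTools(projectRoot: string): MiniTools {
  const root = path.resolve(projectRoot);
  return {
    list_files: (args) =>
      fs.readdirSync(resolveInside(root, args.dir ?? ".")).sort().join("\n"),
    read_file: (args) =>
      fs.readFileSync(resolveInside(root, requireArg(args, "path")), "utf8"),
    write_file: (args) => {
      const target = resolveInside(root, requireArg(args, "path"));
      const content = requireArg(args, "content");
      fs.mkdirSync(path.dirname(target), { recursive: true });
      fs.writeFileSync(target, content, "utf8");
      return `Wrote ${content.length} characters to ${args.path}`;
    },
  };
}

// Demo in a throwaway directory so nothing in a real project is touched.
const sandbox = fs.mkdtempSync(path.join(os.tmpdir(), "mini-agent-"));
const tools = createTools(sandbox);
console.log(tools.write_file({ path: "src/index.ts", content: "export {};" }));
console.log(tools.list_files({ dir: "src" }));
console.log(tools.read_file({ path: "src/index.ts" }));
```

Every tool returns a plain string, which keeps the output format uniform and easy to feed back to the model as a tool result.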

Module 4: Skills and Memory System

Skills are reusable capability modules, similar to predefined workflow templates. The memory system is more complex, divided into multiple levels:

Session memory (short-term memory): Context information from the current conversation, which disappears when the conversation ends
Project-level memory: Knowledge related to a specific project, such as project structure, tech stack, coding conventions, etc.
Team/user-level memory: User preference settings, team development standards, etc.
System-level memory: Global knowledge and configurations

This layered memory mechanism enables the Agent to accumulate and leverage experience at different granularities, avoiding redundant questions about known information and achieving smarter assistance. It's worth noting that the memory system's design is closely tied to the LLM's context window limitations — even Claude 3.5's 200K token window is far from sufficient for all the code in a large project. Therefore, the memory system is essentially an intelligent information retrieval and compression strategy, providing the model with the most relevant background knowledge within a limited context space.
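One way to picture "retrieval and compression within a limited context space" is a selector that fills a token budget from the most specific layer outward. Everything here is an assumption made for illustration: the layer priority, the four-characters-per-token estimate, and the `selectMemory` helper are not taken from Claude Code.

```typescript
type MemoryLayer = "system" | "user" | "project" | "session";

interface MemoryEntry {
  layer: MemoryLayer;
  text: string;
}

// Rough heuristic (an assumption): about four characters per token for English text.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Hypothetical priority order: more specific layers win when space is tight.
const PRIORITY: MemoryLayer[] = ["session", "project", "user", "system"];

// Pick entries layer by layer until the token budget is used up.
function selectMemory(entries: MemoryEntry[], budgetTokens: number): MemoryEntry[] {
  const selected: MemoryEntry[] = [];
  let used = 0;
  for (const layer of PRIORITY) {
    for (const entry of entries.filter((e) => e.layer === layer)) {
      const cost = estimateTokens(entry.text);
      if (used + cost > budgetTokens) continue; // skip entries that do not fit
      selected.push(entry);
      used += cost;
    }
  }
  return selected;
}

const entries: MemoryEntry[] = [
  { layer: "system", text: "Answer in English." },
  { layer: "user", text: "Prefers functional React components." },
  { layer: "project", text: "Monorepo managed with pnpm; tests run with vitest." },
  { layer: "session", text: "User asked to refactor the login form." },
];
console.log(selectMemory(entries, 25).map((e) => e.layer));
```

A real system would likely rank entries by relevance to the current task instead of by layer alone; the fixed priority here only illustrates the budget constraint.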

Mini Claude Code in Practice: Building from Scratch

To validate the feasibility of the architecture described above, you can build a Mini version of Claude Code from scratch using TypeScript. Here are the key implementation points.

Technology Choices

Development language: TypeScript (zero barrier for frontend developers)
Local model: Qwen2.5 9B model deployed via Ollama
Execution method: Command-line interaction, launched with npm run dev

About Ollama and local model deployment: Ollama is an open-source local LLM runtime framework that supports one-click deployment and execution of various open-source LLMs on personal computers. It encapsulates complex processes like model downloading, quantized inference, and API serving, providing an OpenAI-compatible REST API interface that significantly lowers the barrier to using local models. The Qwen2.5 9B model mentioned here is an open-source LLM from Alibaba Cloud's Tongyi Qianwen team, where 9B refers to 9 billion parameters — a medium-scale model that runs smoothly on consumer-grade GPUs (e.g., 8GB VRAM). The advantages of choosing a local model include zero API costs and controllable data privacy, making it ideal for learning and prototype validation.
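A minimal sketch of a model provider that talks to Ollama's OpenAI-compatible endpoint follows. It assumes Ollama is listening on its default address, `http://localhost:11434`, and takes the model tag as a parameter (use whatever `ollama list` reports locally). The `FetchLike` type and the `chat` helper are hypothetical; the fetch function is injected so the demo can run against a stub instead of a live server.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Minimal slice of the fetch API this sketch relies on, so a stub can be injected.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function chat(
  fetchImpl: FetchLike,
  model: string,
  messages: ChatMessage[],
  baseUrl = "http://localhost:11434",
): Promise<string> {
  const res = await fetchImpl(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, messages, stream: false }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  return data.choices[0]?.message.content ?? "";
}

// Stub standing in for a running Ollama server; "demo-model" is a placeholder tag.
const stubFetch: FetchLike = async (url, init) => {
  console.log(`POST ${url}`);
  const sent = JSON.parse(init.body) as { model: string };
  return {
    ok: true,
    status: 200,
    json: async () => ({
      choices: [{ message: { content: `stub reply from ${sent.model}` } }],
    }),
  };
};

const reply = chat(stubFetch, "demo-model", [{ role: "user", content: "hi" }]);
reply.then((text) => console.log(text));
```

On Node.js 18 or later, the built-in `fetch` should be usable as `fetchImpl` in place of the stub. Because the endpoint follows the OpenAI wire format, switching to a hosted provider later should mostly be a matter of changing `baseUrl` and adding an authorization header.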

Agent Loop Core Flow

The entire execution flow fully reproduces the Agent Loop mechanism:

User inputs a prompt (e.g., "analyze the current project structure")
The system sends the prompt along with the system prompt to the local model
The model returns Tool Calls instructions (e.g., calling list_files)
The system executes the tool and returns the result to the model
The model continues analysis and may trigger additional tool calls
Finally outputs the analysis conclusion

Note that because the demo runs on a small 9B-parameter model, it may occasionally produce malformed tool-call arguments or unstable output. For real enterprise development, connecting to a more capable model such as Qwen2.5 Plus, GPT-4, or Claude is recommended for stability.
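One way to soften these parsing problems is a tolerant argument parser on the host side. The sketch below assumes three common failure shapes: arguments that arrive already as an object, JSON wrapped in Markdown code fences, and JSON surrounded by stray prose. The `parseToolArguments` helper is hypothetical.

```typescript
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function tryParse(text: string): unknown {
  try {
    return JSON.parse(text);
  } catch {
    return undefined;
  }
}

// Returns the parsed arguments, or null when nothing usable can be recovered.
function parseToolArguments(raw: unknown): Record<string, unknown> | null {
  if (isPlainObject(raw)) return raw; // already structured
  if (typeof raw !== "string") return null;

  // Strip Markdown code fences that small models sometimes wrap around JSON.
  const unfenced = raw.replace(/^\s*`{3}(?:json)?\s*|\s*`{3}\s*$/g, "");
  const direct = tryParse(unfenced);
  if (isPlainObject(direct)) return direct;

  // Fall back to the outermost {...} span when prose surrounds the JSON.
  const start = unfenced.indexOf("{");
  const end = unfenced.lastIndexOf("}");
  if (start === -1 || end <= start) return null;
  const extracted = tryParse(unfenced.slice(start, end + 1));
  return isPlainObject(extracted) ? extracted : null;
}

const fence = "`".repeat(3);
const samples: unknown[] = [
  { path: "a.ts" },
  '{"path":"a.ts"}',
  `${fence}json\n{"path":"a.ts"}\n${fence}`,
  'Sure! {"path":"a.ts"} Hope that helps.',
  "no arguments here",
];
for (const sample of samples) console.log(parseToolArguments(sample));
```

When the parser returns `null`, a reasonable next step is to send the error back to the model as a tool result and let it retry, rather than aborting the whole loop.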

Core Project Modules

The project contains the following core modules:

Model provider module: Encapsulates the Ollama local model's API interface, handling requests and responses
Tool system module: Implements the minimal tool set (file read/write, command execution, file search, etc.)
Agent Loop module: Implements the loop reasoning and tool scheduling logic — the central dispatcher of the entire system
Skills parsing module: Parses and loads predefined skill configuration files
Memory module: Implements file-based memory storage, supporting cross-session knowledge persistence
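As an illustration of the memory module, file-based persistence can be as small as a JSON file that is read and rewritten on each update. The `FileMemory` class, the note format, and the file name `memory.json` are hypothetical choices for this sketch.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical on-disk format: one JSON file holding a flat list of notes.
interface MemoryNote {
  key: string;
  value: string;
  updatedAt: string;
}

class FileMemory {
  private readonly file: string;

  constructor(file: string) {
    this.file = file;
  }

  private load(): MemoryNote[] {
    if (!fs.existsSync(this.file)) return [];
    return JSON.parse(fs.readFileSync(this.file, "utf8")) as MemoryNote[];
  }

  // Store a note, replacing any existing note with the same key.
  remember(key: string, value: string): void {
    const notes = this.load().filter((n) => n.key !== key);
    notes.push({ key, value, updatedAt: new Date().toISOString() });
    fs.mkdirSync(path.dirname(this.file), { recursive: true });
    fs.writeFileSync(this.file, JSON.stringify(notes, null, 2), "utf8");
  }

  recall(key: string): string | undefined {
    return this.load().find((n) => n.key === key)?.value;
  }
}

// Two separate instances stand in for two separate sessions.
const memoryDir = fs.mkdtempSync(path.join(os.tmpdir(), "mini-agent-mem-"));
const memoryFile = path.join(memoryDir, "memory.json");
new FileMemory(memoryFile).remember("package_manager", "pnpm");
const recalled = new FileMemory(memoryFile).recall("package_manager");
console.log(recalled);
```

Because the state lives on disk rather than in the conversation, a later session can load it at startup and inject the relevant notes into the system prompt.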

Insights from Claude Code's Architecture for Developers

The Best Path for Frontend Developers Transitioning to AI Agent Development

Claude Code's architecture shows that AI Agent development can be done entirely within the TypeScript tech stack. Frontend developers don't need to learn Python or other languages from scratch; they can build fully functional agent applications with their existing Node.js and TypeScript skills. Claude Code itself is written in TypeScript, which demonstrates the viability of the JavaScript/TypeScript ecosystem for Agent development. With TypeScript-native AI development frameworks like Vercel AI SDK and LangChain.js maturing, frontend developers have a natural technical affinity for the Agent development track.

Grasp the Essentials, Don't Get Swept Up by Concept Anxiety

From Prompt Engineering to Context Engineering, from Vibe Coding to Spec Coding to Headless Engineering — new concepts emerge endlessly. But the core remains unchanged: master the tools and understand the fundamental Agent architecture. Grasp the essentials of Agent Loop, tool systems, and memory mechanisms, and all these new concepts will naturally fall into place.

AI Tool Effectiveness Depends on the User's Domain Expertise

Many backend developers are asked to do frontend work with AI coding tools, often with unsatisfactory results. No matter how powerful the tool, the quality of the final product still depends on how deeply the user understands the domain. This is precisely the irreplaceable value of professional developers, and it is the core idea behind "human-AI collaboration": AI excels at pattern matching and code generation, while architectural decisions, requirements understanding, quality control, and user experience judgment still require deep participation from human experts.

Conclusion

Claude Code's agent architecture provides a clear AI Agent development paradigm: the system prompt defines the role, the Agent Loop drives reasoning, the tool system connects to reality, and Skills and the memory system accumulate reusable experience. Master these four modules, and you have the foundation for developing a general-purpose agent.

For developers looking to dive deeper into AI Agent development, I recommend starting by building a Mini version of Claude Code, implementing the complete Agent Loop and tool calling flow hands-on. In the wave of AI reshaping software development, understanding and practicing these architectural concepts will be a developer's most important competitive advantage.

Key Takeaways

Developers' future core competitiveness focuses on two directions: AI-powered efficiency (proficient tool usage) and AI Agent development (agent design and implementation)
Claude Code's agent architecture comprises four core modules: system prompt, Agent Loop, tool system, and Skills with memory system
The Agent Loop is the agent's core execution mechanism, theoretically grounded in the ReAct paradigm, completing complex tasks through a cycle of model reasoning → tool calls → result feedback
The tool system relies on the Function Calling mechanism to enable LLM interaction with the external world — the key to an Agent being able to "take action"
The memory system is divided into four levels: session memory, project-level memory, team/user-level memory, and system-level memory
A Mini version of Claude Code can be built from scratch using TypeScript — frontend developers can get started with Agent development without switching tech stacks