Connect Claude Code to DeepSeek V4: 3-Step Setup in 60 Seconds

Why Connect DeepSeek V4 to Claude Code?

Claude Code, Anthropic's AI programming terminal tool, only supports Claude models by default. Unlike IDE plugins such as Cursor or GitHub Copilot, Claude Code runs directly in the terminal environment, handling code generation, file operations, Git management, and more through command-line interactions. This design allows deep integration into a developer's Shell workflow — reading project files directly, executing system commands, and enabling true "agentic coding" where the AI doesn't just generate code but autonomously executes, debugs, and iterates.

With the release of DeepSeek V4, many developers want to leverage its capabilities within Claude Code's workflow — whether for cost savings or to benchmark different models on programming tasks. DeepSeek V4 uses an advanced Mixture of Experts (MoE) architecture: while the total parameter count is massive, only a subset of expert networks activates during each inference pass. This keeps reasoning costs far below those of comparably sized dense models. On multiple coding benchmarks (HumanEval, MBPP, SWE-bench), DeepSeek V4 demonstrates code generation capabilities on par with — or even surpassing — Claude 4 and GPT-4o.

The good news: with the community tool CC Switch, the entire setup takes just three steps and 60 seconds.

Complete Configuration Guide for Connecting Claude Code to DeepSeek V4

Step 1: Install Claude Code and Verify Your Environment

First, make sure Claude Code is properly installed. Verify by running this command in your terminal:

claude --version

If you see a version number, the installation is successful. If you haven't installed it yet, refer to the official documentation for npm installation. Claude Code runs on Node.js, so your system needs Node.js 18+ pre-installed. The installation command is typically npm install -g @anthropic-ai/claude-code, which registers it as a global CLI tool in your system PATH.

Step 2: Install the CC Switch Model Switching Tool

CC Switch is a community-developed model switching tool whose core function is helping Claude Code connect to different LLM APIs. After installation, open the CC Switch interface, select Claude Code, then click "Add Provider."

The tool's design philosophy is straightforward — it acts as a bridge between Claude Code and third-party model APIs, enabling a tool originally limited to Claude models to call other LLM services. Technically, CC Switch is a local API proxy service. It spins up a locally hosted endpoint compatible with the Anthropic API format. When Claude Code sends a request, CC Switch intercepts it, converts the Anthropic message format (including system prompts, tool use structures, etc.) into the target model's API format (such as OpenAI-compatible format), then forwards it to DeepSeek or other third-party services. When the response returns, it converts the result back into a format Claude Code understands. This protocol translation approach ensures the upper-layer application can seamlessly switch underlying models without any modifications.

Step 3: Configure the DeepSeek V4 API Connection

Complete the following steps:

Get your API Key: Go to the DeepSeek platform (platform.deepseek.com), register, and create an API Key. Note that the API Key is only displayed once at creation — copy and save it immediately to a secure location (such as a password manager). It's recommended to create separate keys for different use cases, making it easier to track usage and quickly rotate keys if one is compromised without affecting other services.
Enter the configuration: Paste the copied API Key into the CC Switch interface
Select the model: Choose DeepSeek as the provider and enter deepseek-v4-pro as the model name
Add a backup model (optional): If you also want to use the lightweight version, add deepseek-v4-flash
Test the connection: Click "Test" to confirm the connection works
Activate: Once the test passes, click "Activate"

How to Use It After Configuration

Once configured, return to your terminal and type claude to launch Claude Code — it will now use DeepSeek V4 for programming tasks.

Inside Claude Code, here are some useful commands to know:

/context: Check the current context length and monitor token consumption. This is crucial for cost control — DeepSeek V4 Pro supports an extra-long context window (128K tokens), but more input tokens mean higher per-request costs. By monitoring context length, you can proactively start a new session when conversations get too lengthy.
/model: Switch between models, toggling flexibly between DeepSeek V4 Pro and Flash
Adjust thinking level: Max out the thinking level via the EventMax parameter for deeper reasoning capabilities. The "thinking level" corresponds to the model's Chain-of-Thought reasoning mechanism. When set higher, the model performs longer internal reasoning before generating the final answer — decomposing problems, considering edge cases, verifying logic — which consumes more tokens and time, but for complex algorithm design or multi-file refactoring, deeper reasoning often produces significantly higher-quality code.

Usage Tips and Considerations

DeepSeek V4 Model Selection Strategy

DeepSeek V4 Pro: Best for complex code generation, architecture design, bug hunting, and other tasks requiring deep reasoning. Its MoE architecture activates more expert networks, delivering stronger reasoning at the cost of higher latency (first-token response typically 2–5 seconds).
DeepSeek V4 Flash: Best for simple code completion, formatting adjustments, and other lightweight tasks — faster response, lower cost. Flash is a distilled or streamlined version of Pro that retains core programming capabilities while dramatically reducing inference overhead, with first-token response typically under 1 second.

In practice, set Flash as your default model for everyday coding, and use the /model command to temporarily switch to Pro when tackling complex problems.

API Cost Comparison

DeepSeek V4's API pricing offers a significant cost advantage over the Claude 3.5/4 series. For developers who use Claude Code heavily, switching to DeepSeek V4 can substantially reduce API expenses. As a reference, Claude 4 Sonnet's input pricing is approximately $3 per million tokens with output at $15 per million tokens, while DeepSeek V4's pricing is typically an order of magnitude lower. For power users generating hundreds of thousands of tokens in interactions daily, the monthly cost difference can reach several hundred dollars. It's recommended to set up usage alerts and monthly budget caps on the DeepSeek platform to avoid unexpected high bills from automated scripts or long-running agent tasks.

Compatibility Notes

Since this CC Switch bridging approach is essentially API-level forwarding, all of Claude Code's interactive features (file read/write, command execution, etc.) work normally — only the underlying reasoning model changes. However, note that different models may vary in their support for tool use. Claude Code's agent capabilities heavily depend on the model accurately generating structured tool-call instructions (such as reading files, executing commands, etc.). If the target model's function calling isn't stable enough, you may occasionally encounter tool-call failures. DeepSeek V4's compatibility in this area is quite mature, but if you run into issues, try reducing task complexity or switching back to the Pro model.

Summary

With CC Switch, developers can flexibly call DeepSeek V4 and other third-party models from Claude Code without changing their workflow. The setup is quick and simple — just three core steps: install the tool → get your key → enter the configuration. If you're looking for a lower-cost alternative for Claude Code or want to compare different models' programming capabilities, this approach is worth trying.

This ability to flexibly switch models also represents an important trend in AI programming tools: the decoupling of the tool layer from the model layer. In the future, developers won't be locked into a single model ecosystem. Instead, they'll dynamically choose the most suitable underlying model based on task characteristics, cost budget, and response speed — as naturally as choosing different compilers or runtimes.