Cursor 2.0 Deep Dive: Hands-On Testing of Five Major Features Including Custom Models and Multi-Agent Parallel Development

Overview: From VS Code Fork to Nearly $10 Billion AI Coding Giant

Cursor, the AI coding IDE beloved by vibe coders everywhere, has grown from zero to a $9.9 billion valuation in just a few years. Its recipe for success seems simple: fork Microsoft's VS Code and deeply integrate AI capabilities. What truly sets it apart, though, is its precise targeting of programmers who "understand code but hate writing it."

VS Code (Visual Studio Code) is an open-source code editor released by Microsoft in 2015, built on the Electron framework, with its core code open-sourced under the MIT license on GitHub. This means any company can legally fork its codebase and build their own product on top of it. Cursor leveraged this open-source advantage, inheriting VS Code's massive plugin ecosystem (over 40,000 extensions), familiar user interface, and mature editor features, while deeply customizing AI integration capabilities on top of it all. This strategy allowed Cursor to avoid the enormous engineering cost of building an IDE from scratch, concentrating resources on AI differentiation.

As for "Vibe Coding," it's a development paradigm that emerged in 2024-2025, popularized by prominent AI figures like Andrej Karpathy. The core idea is that developers no longer write code line by line but instead describe their intent in natural language, letting AI generate most of the code while developers focus on direction and result verification. Cursor is the most popular tool in this cultural wave, with its user base including both senior engineers using it to accelerate daily development and entrepreneurs using it to rapidly validate product ideas.

Now, Cursor 2.0 has officially launched with five major updates, marking the company's transformation from an "AI wrapper tool" to a genuine AI coding platform.


Custom Composer Model: Balancing Speed and Intelligence

No More Waiting—Lightning-Fast Responses

The most eye-catching change in Cursor 2.0 is the launch of their custom model, Composer. The company claims this model approaches the intelligence level of top frontier models (like GPT-5 and Claude) while achieving significantly faster inference speeds.

This matters enormously. Large language model inference speed is typically measured in tokens per second. In coding scenarios, a typical function might contain 200-500 tokens—if the model infers at 50 tokens/s, generating a complete function requires 4-10 seconds of waiting. This latency severely disrupts a developer's flow state during frequent interactions. Technical approaches to improving inference speed include model distillation (compressing large model knowledge into smaller models), speculative decoding, quantization, and task-specific model architecture optimization. Cursor's custom Composer model likely employs a combination of several of these techniques, dramatically increasing speed while maintaining code quality for a substantial improvement in development experience.
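The waiting-time arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a steady decode rate and ignores time-to-first-token and network overhead, and the 200 tokens/s comparison rate is an illustrative assumption, not a measured Composer figure.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to stream num_tokens at a constant decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# 50 tokens/s is the rate used in the text; 200 tokens/s is a hypothetical
# faster model included only for comparison.
for rate in (50, 200):
    low = generation_seconds(200, rate)   # short function
    high = generation_seconds(500, rate)  # long function
    print(f"{rate} tokens/s: {low:.1f} to {high:.1f} s per function")
```

At 50 tokens/s the range matches the 4-10 seconds quoted above; quadrupling the rate cuts the same wait to a quarter, which is the difference between breaking flow and staying in it.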

Benchmark Performance and Concerns

However, Composer's capability claims should be viewed with caution. Its benchmark data comes from internal closed-source evaluations: there's no direct public comparison with Claude, GPT-5, or Gemini, and it hasn't appeared on authoritative external benchmarks like LMArena or SWE-bench.

It's worth explaining why these benchmarks matter. LMArena (formerly LMSYS Chatbot Arena) is a model evaluation platform that originated with researchers at UC Berkeley. It uses blind, pairwise human voting with an Elo-style rating mechanism and is widely regarded as one of the most credible model capability leaderboards in the industry. SWE-bench is a benchmark specifically targeting software engineering capabilities: it extracts tasks from real GitHub issues and requires models to locate problems in complete codebases and generate correct fix patches, with the share of issues resolved on the first attempt (pass@1) as its core metric. In 2024, top models achieved approximately 50-70% on SWE-bench Verified. When a model only showcases results on internal closed-source benchmarks without participating in these public evaluations, the credibility of its capability claims is significantly diminished, as internal testing carries risks of data leakage and evaluation bias.
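To make the Elo idea concrete, here is a minimal sketch of the classic Elo update applied to one blind head-to-head vote. It is the textbook formula, not a reproduction of the leaderboard's actual statistical procedure, and the K-factor of 32 and the starting ratings are assumptions chosen for illustration.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float,
               k: float = 32.0) -> Tuple[float, float]:
    """Return new ratings after one comparison.

    score_a is 1.0 if voters preferred A, 0.0 if they preferred B,
    and 0.5 for a tie. k (assumed 32 here) controls the step size.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two models start level; a voter prefers model A in a blind comparison.
print(update_elo(1000.0, 1000.0, score_a=1.0))
```

Because the update depends on how surprising the result was, a win over a much higher-rated model moves the ratings more than a win over an equal, which is why many independent blind votes are harder to game than a single internal test suite.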

From hands-on testing, Composer's speed in UI component generation tasks does far exceed competitors—Claude comes second, with GPT-5 noticeably behind. But in terms of code quality, Claude and GPT-5 still edge ahead in certain complex scenarios. Interestingly, in a test generating Apple Liquid Glass-style buttons, Claude won with its elegant animations, GPT-5 performed surprisingly poorly, while Composer delivered pleasantly surprising results.

Overall, Composer shows impressive potential, but proving it can truly stand alongside top-tier models will require more transparent, public evaluation data.

Git Worktrees Integration: Multi-Agent Parallel Development

A Revolutionary AI Coding Workflow

The most groundbreaking feature in Cursor 2.0 is Git Worktrees integration. Git worktree is a feature introduced in Git 2.5 (2015) that allows developers to attach multiple working directories to the same Git repository, each checking out a different branch. Unlike a second git clone, linked worktrees share the main repository's .git directory and object database, so creating one is fast and does not duplicate the repository's history; only the checked-out files take extra space. Unlike git stash, each worktree keeps its changes live in its own directory. Essentially, it's a separate local checkout of the code that won't conflict with the main Git workspace.
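For readers who have not used worktrees, the sketch below builds the git commands that would give each agent its own checkout. The `git worktree add -b` and `git worktree remove` subcommands are real Git; the sibling-directory layout, the `agent/<name>` branch naming, and the helper names are assumptions for illustration and say nothing about how Cursor arranges its worktrees internally.

```python
from pathlib import Path
from typing import List


def worktree_add_command(repo: Path, name: str, base: str = "HEAD") -> List[str]:
    """Build the git command that creates a linked worktree for one agent.

    The worktree is placed next to the repository (hypothetical layout)
    and gets a fresh branch named agent/<name> (hypothetical convention).
    """
    worktree_dir = repo.parent / f"{repo.name}-{name}"
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{name}",   # create a new branch for this worktree
        str(worktree_dir),       # directory for the new checkout
        base,                    # commit or branch to start from
    ]


def worktree_remove_command(repo: Path, name: str) -> List[str]:
    """Build the git command that deletes the worktree created above."""
    worktree_dir = repo.parent / f"{repo.name}-{name}"
    return ["git", "-C", str(repo), "worktree", "remove", str(worktree_dir)]


# Print the commands instead of running them, so the sketch has no side effects.
for agent in ("claude", "gpt5", "composer"):
    print(" ".join(worktree_add_command(Path("/tmp/app"), agent)))
```

To actually create the worktrees, each command list could be passed to `subprocess.run(cmd, check=True)` from inside a real repository.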

Cursor cleverly leverages this mechanism to let multiple AI agents work on the same task in parallel. In AI coding scenarios, the value is clear: each agent can freely modify code in its own worktree without interfering with the others or with the developer's current working branch. Once the agents finish, developers can use git diff to compare each agent's output and selectively merge the best solution.

Imagine this scenario: you need to build a design system for a web application. You can simultaneously have Claude, GPT-5, and Composer each independently generate their own approach, then compare and choose the best result. This is no longer theoretical—it's an actual feature in Cursor 2.0.
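The fan-out pattern in this scenario can be sketched in plain Python. Everything here is a stand-in: the `Agent` type, `make_stub_agent`, and the returned strings are hypothetical placeholders for real model calls, and the code only illustrates dispatching one task to several workers at once and collecting the results for comparison.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# An "agent" here is just a function from a task description to a proposal.
Agent = Callable[[str], str]


def run_agents_in_parallel(task: str, agents: Dict[str, Agent]) -> Dict[str, str]:
    """Send the same task to every agent concurrently and gather the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        futures = {name: pool.submit(agent, task) for name, agent in agents.items()}
        # result() blocks until that agent finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}


def make_stub_agent(label: str) -> Agent:
    """Create a placeholder agent; a real one would call a model API."""
    def agent(task: str) -> str:
        return f"[{label}] proposal for: {task}"
    return agent


agents = {name: make_stub_agent(name) for name in ("claude", "gpt5", "composer")}
for name, proposal in run_agents_in_parallel("build a design system", agents).items():
    print(name, "->", proposal)
```

In a real setup each agent would write into its own worktree, and the human reviewer would compare the resulting diffs rather than strings.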

From AI Assistant to "Digital Workforce"

This feature represents a fundamental shift in how AI coding tools are used. Previously, developer-AI interaction was linear—ask, wait, review, modify. Now, developers are more like project managers, simultaneously dispatching multiple AI Agents to push forward, review, and fix code in parallel.

Multi-Agent Systems are one of the core trends in AI engineering for 2024-2025. Their theoretical foundation comes from distributed artificial intelligence research, with the core idea of decomposing complex tasks among multiple specialized AI agents for collaborative completion. In programming, representatives of this paradigm include Devin (a fully autonomous AI software engineer from Cognition Labs), OpenAI's Codex Agent, and GitHub Copilot Workspace. Cursor's Git Worktrees integration represents a more pragmatic multi-agent approach—it doesn't pursue full autonomy but instead keeps human developers in the decision-making seat while leveraging parallelization to improve exploration efficiency. This human-AI collaborative multi-agent model may be more suitable for production environments than fully autonomous agents.

This "parallel-driven" development model is especially valuable for projects requiring rapid iteration, enabling multiple independent solutions to be obtained in a short time for comparative decision-making.

Agent View Mode: An Interface Optimized for Conversational Development

Cursor 2.0 introduces a new Agent View mode—a UI restructuring targeting "conversation-intensive development" scenarios. When developers frequently engage in multi-turn conversations with AI, traditional editor layouts become cluttered. Agent View mode cleans up the interface, letting developers focus more on the AI interaction flow.

This might seem like a minor change, but for users who treat AI as their core development partner, the cumulative experience improvement is significant. A clear conversation interface makes code review and iterative modifications much more efficient. This design philosophy reflects how AI coding tools are evolving from "an add-on feature of the editor" toward "a development environment centered on AI conversation"—the code editor is no longer the sole protagonist, and the AI conversation window is gaining equal or even higher interface priority.

Built-in Browser: A Productivity Powerhouse for Frontend Development

Precisely Pinpointing UI Issues

For frontend developers, Cursor 2.0's native built-in browser may be the most practical update. When handling complex UI features, AI often produces code that's "close but not quite right." Previously, developers had to constantly switch between the IDE and browser, manually describing where the problem lies.

Now, with the built-in browser, developers can directly locate problematic HTML elements and add them to the AI conversation with a single click. Combined with full Chrome DevTools support, debugging information can be seamlessly passed to the AI for analysis and fixes.

Chrome DevTools is the built-in developer tool suite in Google Chrome, including the Elements panel (DOM/CSS inspection), Console (JavaScript console), Network (network request monitoring), Performance (performance analysis), and more. In traditional frontend development workflows, developers write code in the IDE, switch to the browser to preview results, open DevTools to locate issues, then return to the IDE to make changes—this context switching can happen hundreds of times per day. The core value of Cursor's built-in browser is eliminating this switching cost. By connecting DevTools' DOM selector with the AI conversation system, developers can directly "point at" a problematic element and tell the AI "this is wrong" instead of describing the element's position and styling issues in text, dramatically reducing information loss in human-AI communication.
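To see why pointing at an element loses less information than describing it, consider what a structured hand-off could look like. The sketch below is purely hypothetical: `SelectedElement`, its fields, and the prompt layout are assumptions made for illustration, since the format Cursor actually sends to the model is not described here.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SelectedElement:
    """What a picker could capture about one DOM element (hypothetical shape)."""
    selector: str                       # CSS selector that locates the element
    outer_html: str                     # the element's markup
    computed_styles: Dict[str, str] = field(default_factory=dict)


def build_feedback_prompt(element: SelectedElement, complaint: str) -> str:
    """Combine the captured element and the developer's note into one message."""
    style_lines = [
        f"  {prop}: {value}"
        for prop, value in sorted(element.computed_styles.items())
    ]
    if not style_lines:
        style_lines = ["  (none captured)"]
    parts = [
        f"Element: {element.selector}",
        "HTML:",
        f"  {element.outer_html}",
        "Computed styles:",
        *style_lines,
        f"Problem: {complaint}",
    ]
    return "\n".join(parts)


button = SelectedElement(
    selector="button.primary",
    outer_html='<button class="primary">Save</button>',
    computed_styles={"padding": "2px", "border-radius": "0px"},
)
print(build_feedback_prompt(button, "Padding is too tight and corners should be rounded."))
```

The developer types only the last line; the selector, markup, and styles arrive exactly as the browser sees them, instead of being paraphrased from memory.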

A Closed-Loop Development Experience

This means the entire cycle of "code → preview → identify issues → precise feedback → AI fix" can be completed within Cursor without switching contexts. For UI-intensive projects, this is a genuine boost to frontend development efficiency.

Conclusion: Is Cursor 2.0 Worth the Upgrade?

Cursor 2.0's five major updates—the custom Composer model, Git Worktrees multi-agent parallelism, Agent View mode, built-in browser, and overall UI optimization—all point in one direction: pushing AI coding from "assistive tool" toward "collaborative platform."

Of course, Cursor's core limitations remain: its capability ceiling largely depends on the quality of underlying models, and the true level of its custom model still awaits verification. Additionally, the cost of advanced features is something developers need to consider—as the original video puts it, "the only limit to your potential isn't imagination, but the balance in your bank account."

What is clear, though, is that Cursor is defining new standards for AI-assisted programming. For programmers pursuing development efficiency, version 2.0 is worth serious exploration.

Key Takeaways

Cursor launched its custom Composer model with speed far exceeding GPT-5 and Claude, though quality still needs external benchmark verification
Git Worktrees integration enables parallel development across multiple AI agents, revolutionizing AI coding workflows
The native built-in browser supports precise UI element targeting with direct AI feedback, creating a closed-loop development experience
The new Agent View mode optimizes the interface for conversation-intensive development scenarios
Cursor is transforming from an AI wrapper tool into a genuine AI coding platform, though its reliance on closed-source internal benchmarks leaves its capability claims unverified