Antigravity 2.0's New IDE Launch Goes Wrong: Bugs, UX Regression, and User Backlash

Antigravity 2.0's IDE redesign backfires with bugs, poor UX, and a developer trust crisis.
Antigravity 2.0's newly redesigned IDE has triggered widespread user backlash over frequent bugs, a degraded user experience, limited model support, and excessive Gemini token consumption. Most damning is evidence suggesting that the company's own developers don't use the product. The incident highlights three lessons for the AI IDE space: stability must come before AI features, token efficiency is essential, and teams must dogfood their own tools.
Introduction: When an IDE Loses Its Soul
Antigravity 2.0 recently shipped a redesigned IDE, but the update has not earned the praise the team presumably expected. User feedback has been overwhelmingly negative: severe bugs, a poor user experience, limited model support, and rapid consumption of Gemini token quotas have drawn a storm of criticism. Even more telling, there are signs that Antigravity's own developers may not be using the product for their daily development work.

Rampant Bugs: Even Basic Functions Are Unstable
For any IDE product, stability is the absolute baseline requirement. Yet Antigravity 2.0's new IDE clearly fails to meet this standard. Numerous users have reported various bugs affecting daily use, from basic editor interactions to code completion—problems are virtually everywhere.
To see why this matters, consider the role an IDE plays in a developer's workflow. An IDE (Integrated Development Environment) is the core productivity tool many developers work in for most of the day, so users expect it to be about as dependable as the operating system beneath it. Mature IDEs like the JetBrains suite and VS Code went through years of iterative refinement to reach production-grade stability. VS Code, which more than 70% of respondents to recent Stack Overflow developer surveys report using, owes much of that position to a stable Electron-based architecture and the testing resources Microsoft invested. Emerging AI IDEs must not only keep traditional functions like editing, debugging, and version control stable, but also handle the added complexity of AI inference requests: network latency, model response anomalies, and context window management are all new potential failure points.
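One way to keep those AI failure points from dragging down the editor itself is to put a hard time limit around every model call and fall back to "no suggestion" when it is missed. The sketch below is only an illustration under assumptions: `complete_with_timeout`, `slow_model`, and the 0.5-second default are hypothetical names and values, not anything taken from Antigravity or another product.

```python
import concurrent.futures
import time
from typing import Callable, Optional

# One shared worker pool; a hypothetical editor would create this at startup.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=2)


def complete_with_timeout(
    request_fn: Callable[[str], str],
    prompt: str,
    timeout_s: float = 0.5,
) -> Optional[str]:
    """Return a suggestion, or None if the model is slow or fails."""
    future = _POOL.submit(request_fn, prompt)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # Best effort only: a call that is already running is not interrupted.
        future.cancel()
        return None
    except Exception:
        # Model or network error: degrade to "no suggestion" instead of crashing.
        return None


def slow_model(prompt: str) -> str:
    """Hypothetical stand-in for a stalled network call."""
    time.sleep(0.3)
    return "late suggestion"
```

With this wrapper, `complete_with_timeout(slow_model, "def add(", timeout_s=0.05)` returns `None`, and typing continues without waiting on the model.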
If a development tool can't even guarantee a basic coding experience, then no matter how flashy the AI features are, they're just castles in the air. Users choose an IDE primarily to boost development efficiency, and frequent bugs don't improve efficiency—they add extra cognitive burden. This "negative optimization" experience has driven many early adopters to quickly revert to the old version or switch to competitors.
Major UX Regression: Redesign Doesn't Mean Better Design
Beyond the bugs, the new IDE's user experience design has also drawn widespread criticism. The so-called "redesign" hasn't delivered a more intuitive workflow—instead, it has added unnecessary complexity in many areas.
Severely Limited Model Support
In today's era of flourishing AI coding tools, the breadth and depth of model support is a core competitive advantage. However, Antigravity 2.0's model support is quite limited, directly impacting users' flexibility across different use cases.
The major LLMs today include OpenAI's GPT-4o/GPT-4.1 series, Anthropic's Claude 3.5/4 series, Google's Gemini 2.5 series, and open-source options like DeepSeek and Llama. These models differ in their strengths across code generation, bug fixing, and code explanation: Claude, for example, is often praised for long-context code understanding, while GPT-4o is seen as more balanced in multilingual code generation. Leading products like Cursor let users switch models freely based on the task, and even support bringing your own API key to connect any compatible model. This flexibility widens the product's applicability and helps retain users. With competitors like Cursor and Windsurf supporting multiple mainstream LLMs, Antigravity's shortcomings in model support become even more glaring.
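To make the idea of per-task model switching concrete, here is a minimal sketch of a routing table with user overrides. Everything in it is an assumption for illustration: the task names, the `ModelChoice` type, and the default routes are hypothetical and do not describe how Cursor, Windsurf, or Antigravity actually configure models.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ModelChoice:
    provider: str
    model: str
    api_key: Optional[str] = None  # None means "use the IDE's built-in quota"


# Hypothetical defaults; the names only echo the model families mentioned above.
DEFAULT_ROUTES: Dict[str, ModelChoice] = {
    "long_context_review": ModelChoice("anthropic", "claude"),
    "code_generation": ModelChoice("openai", "gpt-4o"),
}
FALLBACK = ModelChoice("google", "gemini")


def pick_model(
    task: str,
    user_overrides: Optional[Dict[str, ModelChoice]] = None,
) -> ModelChoice:
    """User overrides win, then per-task defaults, then the fallback model."""
    if user_overrides and task in user_overrides:
        return user_overrides[task]
    return DEFAULT_ROUTES.get(task, FALLBACK)
```

A user who brings their own key would pass something like `{"code_generation": ModelChoice("deepseek", "deepseek-coder", api_key="USER_KEY")}` as `user_overrides`, and only that task would be rerouted.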
Abnormal Gemini Token Quota Consumption
What frustrates users even more is that the new IDE burns through Gemini token quotas rapidly during use. Users not only have to endure a poor experience but also pay real money for these low-quality interactions.
To understand this, it helps to grasp token economics in AI tools. A token is the basic billing unit for LLMs. As a rough rule of thumb, each English word corresponds to about 1-1.5 tokens and each Chinese character to about 1.5-2 tokens, although the exact ratio depends on the model's tokenizer. In AI IDE scenarios, every code completion or conversational interaction packages the current code context, project structure, user instructions, and other information to send to the model, so a single request can consume thousands or even tens of thousands of tokens. Common strategies for reducing token consumption include intelligent context trimming (sending only code snippets relevant to the current task), request deduplication and caching (avoiding repeated requests for identical code snippets), incremental context updates (transmitting only changes rather than the full context), and sensible request triggering (avoiding API calls on every keystroke). Google offers a free usage tier for Gemini models through Google AI Studio, but in high-frequency IDE usage, quota consumption far exceeds that of ordinary chat.
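Two of these strategies, context trimming and request caching, can be sketched in a few lines. This is a simplified illustration, not any vendor's implementation: `estimate_tokens` uses an assumed 1.3 tokens per whitespace-separated word instead of a real tokenizer, and `trim_context` and `CachedClient` are hypothetical names.

```python
import hashlib
import math
from typing import Callable, Dict, List

TOKENS_PER_WORD = 1.3  # assumption: midpoint of the rough 1-1.5 range for English


def estimate_tokens(text: str) -> int:
    """Crude estimate from the word count; a real tool would call the tokenizer."""
    return math.ceil(len(text.split()) * TOKENS_PER_WORD)


def trim_context(snippets: List[str], budget: int) -> List[str]:
    """Keep snippets (assumed sorted most relevant first) until the budget is spent."""
    kept: List[str] = []
    used = 0
    for snippet in snippets:
        cost = estimate_tokens(snippet)
        if used + cost > budget:
            break  # stop at the first snippet that does not fit
        kept.append(snippet)
        used += cost
    return kept


class CachedClient:
    """Wraps a model call so that identical prompts are only sent once."""

    def __init__(self, send: Callable[[str], str]) -> None:
        self._send = send
        self._cache: Dict[str, str] = {}
        self.calls = 0  # number of real (billed) requests

    def complete(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._send(prompt)
        return self._cache[key]
```

Calling `complete` twice with the same prompt leaves `calls` at 1, so the second answer costs no quota. A production cache would also need eviction and invalidation when the underlying files change.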
The inefficient token consumption points to serious deficiencies in API call optimization, likely including unnecessary duplicate requests and poor context management.
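A common guard against firing a request on every keystroke is debouncing: wait for a short pause in typing before calling the model. The sketch below simulates that policy over recorded keystroke timestamps so the effect is easy to see. The function name and the 300 ms quiet period are assumptions chosen for illustration, and nothing here is based on Antigravity's actual triggering logic.

```python
from typing import List


def debounced_triggers(keystroke_ms: List[int], quiet_ms: int = 300) -> List[int]:
    """Return the times (in ms) at which a completion request would fire.

    A request fires quiet_ms after a keystroke only if no other keystroke
    arrived in the meantime, so a burst of typing yields a single request.
    Timestamps are assumed to be sorted in ascending order.
    """
    fire_times: List[int] = []
    for i, t in enumerate(keystroke_ms):
        is_last = i == len(keystroke_ms) - 1
        if is_last or keystroke_ms[i + 1] - t >= quiet_ms:
            fire_times.append(t + quiet_ms)
    return fire_times
```

For keystrokes at 0, 100, 200, 1000, and 1050 ms, the function returns `[500, 1350]`: two requests instead of five.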
"Eating Your Own Dog Food": A Developer Trust Crisis
Among all the negative feedback, perhaps the most ironic discovery is this: there are indications that Antigravity's own development team uses other development tools in their daily work, rather than the IDE they built.
In the tech industry, "eating your own dog food" (dogfooding) is a widely recognized product principle: if you're not willing to use your own product, how can you expect users to buy in? The phrase is usually traced to a 1988 email from Microsoft manager Paul Maritz, titled "Eating our own Dogfood," which pushed for more internal use of the company's LAN Manager product. Since then, the practice has become a gold standard in the tech industry. Microsoft's Office team is said to use its own suite daily, Google employees extensively use early versions of Gmail, Google Docs, and other products internally, and Apple engineers reportedly begin using prototype iPhones months before launch. In the IDE space, the JetBrains team uses IntelliJ IDEA to develop IntelliJ IDEA itself, and the VS Code team similarly uses VS Code for daily development. The value of this practice is clear: developers discover problems from a real user's perspective, bug fix priorities align better with actual usage, and product decisions stay grounded.
This discovery fundamentally shakes user trust in the product team. It sends a dangerous signal: even the people who know this product best don't think it's good enough to use. When a product team doesn't use their own product, it often means there's a serious disconnect between the product and real needs.
Three Lessons for the AI IDE Space
This incident offers three important lessons for the fast-growing AI IDE market:
First, foundational experience cannot be sacrificed for AI features. No matter how powerful the AI capabilities are, an IDE's core value still lies in providing developers with a stable, efficient coding environment. Rushing to push major updates before basic functionality is polished often backfires.
Second, token economics is the lifeline of AI tools. As AI coding tools move from free trials to paid subscriptions, users will become increasingly sensitive to token consumption. Optimizing API call efficiency while maintaining feature quality is an engineering challenge every AI tool team must solve.
Third, product teams must be heavy users of their own products. Only by continuously using your own product in daily development can you truly understand user pain points and discover and fix issues at the earliest opportunity.
Conclusion
Antigravity 2.0's update is a textbook case of a product launch gone wrong, and it comes as competition in AI coding tools intensifies. Cursor, built as a VS Code fork, rose rapidly in 2024 on the strength of its AI code editing experience and reached a multi-billion dollar valuation; Windsurf (formerly Codeium) focuses on deep code understanding and automated workflows; GitHub Copilot, one of the earliest widely adopted AI coding assistants, continues to iterate on the back of the GitHub ecosystem and Microsoft's resources; and newer players like Augment Code, Trae (from ByteDance), and Zed keep entering the market. What makes this space unique is that developers are among the most demanding user groups: they have extremely high requirements for tool performance, stability, and efficiency, yet their migration costs are relatively low. Switching editors often takes little more than a few minutes of configuration, so products must continuously deliver outstanding experiences to retain users.
The Antigravity team needs to quickly respond to user feedback, fix core issues, and re-examine their product development process—otherwise, in this rapidly reshuffling market, user attrition may happen far faster than they imagine.