Why Engineering Teams Are Cutting AI Spending: The Shift from Hype to Rationality

Introduction

As the AI hype sweeps across the entire tech industry, a seemingly contradictory trend is quietly emerging: more and more engineering departments are beginning to cut AI-related spending. This observation comes from Gergely Orosz, author of the renowned tech newsletter The Pragmatic Engineer, who explored this phenomenon in depth in his latest "The Pulse" column.

The Pragmatic Engineer is one of the most influential engineering newsletters in the tech industry, founded by former Uber engineering manager Gergely Orosz. Its "The Pulse" column tracks engineering practice trends within large tech companies and high-growth startups. Because it draws primarily on the author's extensive industry network and anonymous surveys, it often captures trends early, before mainstream media covers them.


AI Spending Inflation: From Frenzied Investment to Sober Reflection

Over the past two years, nearly every engineering team has been investing heavily in AI tools and infrastructure. From subscription fees for AI coding assistants like GitHub Copilot, to API call costs for large language models, to GPU compute procurement — AI-related spending has become the fastest-growing segment of engineering department budgets.

To understand the scale of this spending, it helps to break down its components. Take GitHub Copilot as an example: this AI coding assistant, jointly developed by GitHub and OpenAI, was originally built on OpenAI's Codex model (a code-specialized variant of the GPT series) and generates real-time suggestions from the surrounding code context. Its Business plan is priced at $19 per user per month, so for a 500-person engineering team this single tool costs about $114,000 a year. LLM API costs are even harder to predict. Vendors like OpenAI, Anthropic, and Google generally use per-token pricing (a token is the smallest unit of text processing, roughly equivalent to 0.75 English words), with input prices for GPT-4-level models ranging from a few dollars to tens of dollars per million tokens. When teams integrate LLMs into multiple workflows such as code review, documentation generation, and test case writing, token consumption can skyrocket. On the GPU compute front, high-end AI chips like the NVIDIA H100 have been in short supply over the past two years, keeping cloud GPU instance prices elevated: renting a server equipped with 8 H100s can cost tens of thousands of dollars per month.
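The arithmetic behind these figures is simple enough to sketch. The seat calculation below uses the numbers from the text; the API workload (reviews per day, tokens per review, and the $5 per million input tokens price) is a made-up assumption for illustration, not a quoted vendor rate.

```python
def annual_seat_cost(seats: int, price_per_seat_month: float) -> float:
    """Yearly subscription cost for a tool priced per seat per month."""
    return seats * price_per_seat_month * 12


def token_cost(tokens: int, price_per_million: float) -> float:
    """Cost of processing `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# Figures from the text: 500 engineers at $19 per user per month.
copilot_yearly = annual_seat_cost(500, 19.0)
print(f"Seat licences: ${copilot_yearly:,.0f} per year")

# Hypothetical API workload (assumption): 200 automated code reviews a day,
# 20,000 input tokens each, 22 working days a month, $5 per million tokens.
monthly_tokens = 200 * 20_000 * 22
print(f"Review API input: ${token_cost(monthly_tokens, 5.0):,.2f} per month")
```

Seat costs scale with headcount and are easy to forecast. The token line depends on three usage variables that nobody controls directly, which is why it is the harder number to budget.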

However, as the initial excitement gradually fades, engineering leaders are confronting a practical question: What exactly is the ROI on these AI investments?

Many teams have found that the actual productivity gains from AI tools haven't met initial expectations. While AI coding assistants can indeed accelerate certain coding tasks, AI-generated code may actually create additional work during code review, debugging, and maintenance. Studies and practitioner reports suggest that while AI-generated code is typically syntactically correct, it often lacks understanding of the project's overall architecture and may introduce subtle logic errors, security vulnerabilities, or patterns that don't conform to team coding standards. Developers need to spend extra time reviewing and correcting this code, and in some scenarios this "review tax" can even offset the coding speed gains AI provides. Meanwhile, API call fees and compute costs continue to climb.

Three Key Drivers Behind Engineering Teams Cutting AI Spending

Cost Control Under Macroeconomic Pressure

After massive waves of layoffs, cost-control awareness in the tech industry has intensified significantly. Between late 2022 and 2024, the tech industry, including giants such as Meta, Google, Amazon, and Microsoft, laid off hundreds of thousands of employees. These layoffs were driven largely by over-expansion during the pandemic followed by macroeconomic tightening. As one of the largest cost centers in any enterprise, engineering departments naturally became a focus of budget scrutiny. In this "efficiency-first" atmosphere, CFOs and engineering VPs demand stricter cost-benefit justification for every new expenditure. AI spending, as a "new" line item, is particularly easy to put under the microscope.

Difficulty Quantifying AI Tool Effectiveness

The productivity gains from AI tools are often difficult to measure precisely. A developer writes code faster with Copilot, but has overall delivery speed actually improved? Has code quality gotten better? These questions lack clear data support, making renewal decisions for AI tools challenging.

In fact, how to measure software engineering productivity has been a long-standing debate within the industry. Widely recognized frameworks include the DORA metrics from the DevOps Research and Assessment program, now part of Google (deployment frequency, lead time for changes, change failure rate, time to restore service), and the SPACE framework proposed by researchers at GitHub, Microsoft Research, and the University of Victoria (Satisfaction and well-being, Performance, Activity, Communication and collaboration, Efficiency and flow). However, these frameworks measure a team's overall engineering effectiveness, and isolating the specific contribution of any single AI tool from these metrics is nearly impossible. If a team's delivery speed improved by 15%, was it because they introduced an AI coding assistant, because they simultaneously performed a microservices decomposition or improved their CI/CD pipeline, or simply because team members became more familiar with the new project? This attribution difficulty means that the value proposition for AI tools remains at a qualitative level, unable to provide the quantitative answers that finance departments demand.
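To make the attribution problem concrete, here is a minimal sketch of how the four DORA metrics might be computed from deployment records. The `Deployment` record, its field names, and the sample data are assumptions for illustration; real teams would pull these values from their CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                    # first commit of the change
    deployed_at: datetime                     # when it reached production
    failed: bool                              # did it cause a production failure?
    restored_at: Optional[datetime] = None    # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def dora_metrics(deploys: list, period_days: int) -> dict:
    """Team-level DORA metrics over a reporting period (hypothetical schema)."""
    failures = [d for d in deploys if d.failed]
    restore = [hours(d.restored_at - d.deployed_at)
               for d in failures if d.restored_at is not None]
    return {
        "deploys_per_day": len(deploys) / period_days,
        "median_lead_time_hours": median(
            hours(d.deployed_at - d.committed_at) for d in deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "median_restore_hours": median(restore) if restore else 0.0,
    }


# Made-up sample: four deployments over two days, one of which failed.
t0 = datetime(2024, 1, 1, 9, 0)
day = timedelta(days=1)
sample = [
    Deployment(t0, t0 + timedelta(hours=4), False),
    Deployment(t0, t0 + timedelta(hours=8), True, t0 + timedelta(hours=10)),
    Deployment(t0 + day, t0 + day + timedelta(hours=2), False),
    Deployment(t0 + day, t0 + day + timedelta(hours=6), False),
]
metrics = dora_metrics(sample, period_days=2)
print(metrics)
```

Nothing in these inputs records which tool, refactoring, or process change contributed to a given deployment. The metrics are team-level aggregates by construction, which is exactly why they cannot answer the per-tool question on their own.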

AI Costs Growing Far Beyond Expectations

Many teams underestimated the long-term costs of AI tools during the trial phase. When scaling from small pilots to full-team deployment, subscription fees, API call volumes, and infrastructure costs often grow much faster than headcount alone would suggest, far exceeding initial budget plans.

This cost inflation follows several typical patterns. First is the "token consumption black hole": once developers get accustomed to using LLMs for assistance, call frequency far exceeds expectations, especially when AI is integrated into automated pipelines (such as automated code review, auto-generated PR descriptions, and automated test generation), where API call volumes can be an order of magnitude higher than manual usage. Second is the "model upgrade trap": as more powerful models like GPT-4o and Claude 3.5 Sonnet are released, teams naturally tend to upgrade for better results. Even when per-token prices fall from one generation to the next, more powerful models encourage more complex use cases with longer prompts and more calls, so total spending still rises. Third is "infrastructure sprawl": to reduce latency or protect data privacy, some teams choose to self-host open-source models, but this requires procuring and maintaining expensive GPU servers, and the operational costs can far exceed those of the original cloud API approach.
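The first two patterns can be illustrated with a small spending model. Every number below (call volumes, tokens per call, prices) is an assumption chosen to show the shape of the effect, not a measurement or a vendor price.

```python
def monthly_spend(calls_per_day: int, tokens_per_call: int,
                  price_per_million: float, days: int = 22) -> float:
    """Monthly API spend for one usage pattern, in dollars."""
    total_tokens = calls_per_day * tokens_per_call * days
    return total_tokens / 1_000_000 * price_per_million


# Baseline (assumption): developers calling the model by hand.
manual = monthly_spend(calls_per_day=300, tokens_per_call=4_000,
                       price_per_million=10.0)

# "Token consumption black hole": automated pipelines make ten times the calls.
pipeline = monthly_spend(calls_per_day=3_000, tokens_per_call=4_000,
                         price_per_million=10.0)

# "Model upgrade trap": the per-token price halves, but richer use cases
# send four times as many tokens per call.
upgraded = monthly_spend(calls_per_day=3_000, tokens_per_call=16_000,
                         price_per_million=5.0)

for label, cost in [("manual", manual), ("pipeline", pipeline),
                    ("upgraded", upgraded)]:
    print(f"{label:>9}: ${cost:,.0f} per month")
```

In this sketch the upgraded configuration costs twice as much as the pipeline configuration even though the per-token price was cut in half, because usage grew faster than the price fell.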

Cutting Doesn't Mean Abandoning: A More Rational AI Investment Strategy

Interestingly, cutting AI spending doesn't mean engineering teams are giving up on AI. More accurately, it represents a shift from a "cast a wide net" approach to AI procurement toward more targeted investment.

Selecting Core AI Tools Rather Than Deploying Everything

More and more teams are evaluating which AI tools truly deliver value and which are merely "nice to have." Rather than subscribing every developer to five or six AI tools, it's better to concentrate investment on one or two proven core tools.

The current AI coding tool market is already quite crowded: GitHub Copilot, Cursor, Windsurf (formerly Codeium), Amazon CodeWhisperer (now part of Amazon Q Developer), Tabnine, Sourcegraph Cody, and others each have their own focus. Many teams tried multiple tools simultaneously during the exploration phase, but found significant feature overlap in actual use, since most tools' core capabilities center on code completion and conversational programming. Consolidating to one or two core tools not only reduces subscription costs but also decreases the cognitive burden on developers switching between tools, while simplifying security audits and compliance management for IT departments.

Establishing Internal AI Tool Evaluation Mechanisms

Mature engineering organizations are building evaluation frameworks for AI tools, using A/B testing and productivity metrics to quantify the actual impact of AI investments, letting data rather than intuition guide budget allocation.

Specifically, some leading engineering teams have begun adopting controlled experiments: randomly dividing engineers into an experimental group that uses AI tools and a control group that doesn't, then tracking differences between the two groups over weeks to months across dimensions such as code commit frequency, PR merge time, bug introduction rate, and developer satisfaction. For example, Google has published results from a randomized controlled trial of its internal AI coding features, and Microsoft has published research on Copilot's impact on developer productivity. While this rigorous evaluation approach has higher implementation costs, it provides far more reliable evidence for decisions than subjective impressions. Some teams have also started tracking "AI tool adoption rate", the percentage of scenarios where developers actually choose to use AI tools when they're available, a metric that often reflects a tool's practical value well.
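One way such a comparison could be analyzed is with a permutation test on a single metric, sketched below using only the standard library. The PR merge times are invented sample data, and the choice of a permutation test is an assumption made here for illustration; the source does not say which statistical method any of these teams use.

```python
import random
from statistics import mean


def permutation_p_value(treatment, control, n_resamples=5_000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns the share of random relabelings whose mean difference is at
    least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(treatment) - mean(control))
    pooled = list(treatment) + list(control)
    k = len(treatment)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if abs(mean(pooled[:k]) - mean(pooled[k:])) >= observed:
            extreme += 1
    return extreme / n_resamples


def adoption_rate(times_used: int, times_available: int) -> float:
    """Share of eligible scenarios in which developers chose the AI tool."""
    return times_used / times_available


# Invented sample: PR merge time in hours for each engineer.
with_ai = [20, 22, 19, 25, 21, 18, 23, 20]
without_ai = [24, 27, 22, 30, 26, 25, 28, 23]

p = permutation_p_value(with_ai, without_ai)
print(f"mean with AI: {mean(with_ai):.1f} h, without: {mean(without_ai):.1f} h")
print(f"permutation p-value: {p:.4f}")
print(f"adoption rate: {adoption_rate(30, 120):.0%}")
```

A small p-value only says the gap is unlikely to come from random assignment. Whether a few hours of merge time justifies the tool's cost is still a budget decision, and a single metric like this should be read alongside bug rates and satisfaction data.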

Focusing on Total Cost of Ownership for AI Tools

Beyond direct subscription and API fees, teams are beginning to account for the hidden costs of AI tools — including the learning curve, integration maintenance, and the long-term maintenance costs of AI-generated code.

Total Cost of Ownership (TCO) is a classic concept in enterprise IT procurement, but in the AI tools domain, hidden costs are often severely underestimated. On the learning curve front, teams need to invest time training developers on how to effectively use AI tools and write high-quality prompts, during which productivity may actually decline. On the integration maintenance front, connecting AI tools to existing IDEs, CI/CD pipelines, and code repositories requires ongoing engineering effort, and frequent AI tool updates may cause integrations to break regularly. Most noteworthy is the long-term maintenance cost of AI-generated code — when the original developer leaves, their successor faces large volumes of AI-generated code lacking clear design intent, potentially making comprehension and maintenance significantly more difficult. The cumulative effect of this "technical debt" is hard to detect in the short term, but over time it could become a massive hidden expense.
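A rough TCO estimate can be sketched by converting the hidden costs into engineering hours and pricing them at a loaded hourly rate. The cost categories follow the paragraph above; the `ToolCosts` structure and every figure in the example are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ToolCosts:
    subscriptions: float      # annual seat licences, in dollars
    api_usage: float          # annual API fees, in dollars
    onboarding_hours: float   # training and prompt-writing ramp-up
    integration_hours: float  # building and repairing IDE and CI/CD integrations
    rework_hours: float       # extra review and maintenance of AI-generated code


def total_cost_of_ownership(costs: ToolCosts, hourly_rate: float) -> dict:
    """Split annual cost into direct fees and hidden engineering time."""
    direct = costs.subscriptions + costs.api_usage
    hidden_hours = (costs.onboarding_hours + costs.integration_hours
                    + costs.rework_hours)
    hidden = hidden_hours * hourly_rate
    return {"direct": direct, "hidden": hidden, "total": direct + hidden}


# Hypothetical 500-person team; all inputs are assumptions.
example = ToolCosts(
    subscriptions=114_000,      # 500 seats at $19 per month
    api_usage=30_000,
    onboarding_hours=2_000,     # 4 hours per engineer
    integration_hours=600,
    rework_hours=1_500,
)
tco = total_cost_of_ownership(example, hourly_rate=90.0)
print(f"direct: ${tco['direct']:,.0f}")
print(f"hidden: ${tco['hidden']:,.0f}")
print(f"total:  ${tco['total']:,.0f}")
```

Under these assumed inputs the hidden costs are more than double the direct fees. The exact ratio will differ for every team, but the exercise of estimating the hours at all is what most budgets skip.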

What This Trend Means for the AI Industry

This trend sends an important signal to AI tool vendors: The halo effect of AI alone is no longer enough — products must demonstrate tangible business value.

From a macro perspective on technology adoption, the current adjustment in the AI tools market aligns closely with Gartner's Hype Cycle model. This model divides the adoption process for new technologies into five stages: Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, and Plateau of Productivity. The explosive growth of generative AI in 2023 corresponds to the "Peak of Inflated Expectations," while engineering teams now beginning to scrutinize AI spending rationally is a classic signal of entering the "Trough of Disillusionment." Historically, technologies like cloud computing, microservices, and containerization all went through similar cycles: initial frenzied adoption, followed by sober reflection, and ultimately settling into industry-standard practices. AI tools will most likely follow this same path, and the products that can still prove their value during the "trough" will emerge as the ultimate winners.

For engineering teams, this is also a healthy adjustment process. After the initial AI frenzy, the industry is entering a more rational and pragmatic phase. AI tools that can clearly demonstrate ROI will continue to receive investment, while those with unclear results face the risk of being eliminated.

As Gergely Orosz has consistently observed through The Pragmatic Engineer, engineering practices at large tech companies and startups are evolving rapidly. The return to rationality in AI spending may well be one of the signs that this industry is maturing.

Conclusion: From Hype-Driven to Value-Driven

The trend of engineering departments cutting AI spending fundamentally reflects the transition of technology investment from hype-driven to value-driven. This isn't a failure of AI. It is the industry going through a necessary "de-bubbling" process. For practitioners, the key lies in building rigorous evaluation systems to ensure every dollar of AI investment is spent where it matters most.

From a longer-term perspective, this return to rationality is actually beneficial for the healthy development of the AI tools ecosystem. When the market shifts from "buy anything with AI" to "prove value before buying," AI tool vendors will be forced to focus more on actual product utility than on marketing narratives, pushing the entire industry toward higher quality. For engineering leaders, now is the ideal time to establish AI tool governance: setting clear evaluation criteria, building cost monitoring mechanisms, and cultivating team AI literacy. This foundational work will determine whether organizations can efficiently capture value when AI technology truly matures.