The Era of Managed Agents: Anthropic vs. Google's Two Diverging Approaches

Anthropic and Google take diverging approaches to Managed Agents, defining a new AI infrastructure category.
Managed Agents are emerging as a new product category where cloud providers host the entire agent runtime — sandboxing, state persistence, failure recovery, and credential management. Anthropic takes a depth-first approach, exposing full control with versioned agents, vaults, and memory stores. Google opts for simplicity with a single API call. Both carry hidden pricing traps and vendor lock-in risks that builders must carefully evaluate.
In 2025, AI Agent infrastructure is undergoing a profound transformation. Google has launched Managed Agents within the Gemini API, entering the arena right behind Anthropic. This marks the emergence of a new product category — Agents are no longer just model calls, but a fully managed runtime. If you're still building your own agent loops to handle long-running tasks, it's time to pay attention to this trend.
What Are Managed Agents?
In simple terms, a Managed Agent is a cloud-hosted runtime environment that executes the agent loop on your behalf. You only need to define three things: the model, the system prompt, and the tools. Then you send a message, and the platform handles everything else.
And "everything else" is precisely the most time-consuming part. As agent runs stretch from seconds to minutes or even hours, the operational challenges multiply:
- Sandbox environments: Spinning up containers for each session, managing file systems and package managers
- Persistent state: Maintaining state across tool calls, managing context windows
- Failure recovery: Handling network jitter, memory overflow, rate limits, and other exceptions
- Credential management: Securely managing OAuth tokens, preventing prompt injection leaks
- Container isolation: Ensuring security in multi-tenant environments
The Three-Layer Architecture Model for Managed Agents
Anthropic has proposed a highly useful architectural framework that decomposes agents into three decoupled components:
- Brain: The model and its decision loop — stateless by design, restartable at any time
- Hands: Sandboxes and tools — ephemeral and disposable; if one fails, spin up a new one
- Session: A persistent, append-only event log that exists outside the model's context window
The key value of this decoupled design: if the brain crashes, you can wake up a new brain, hand it the session log, and it can resume execution from the last recorded event. This is critical for long-running workloads.
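The resume-from-log idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Anthropic's implementation: `Event`, `SessionLog`, `run_brain`, and the fixed `plan` list (standing in for model decisions) are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Event:
    seq: int      # position in the log, starting at 0
    kind: str     # hypothetical event type, e.g. "step_done"
    payload: str


@dataclass
class SessionLog:
    """Append-only event log that lives outside any model context."""
    events: list = field(default_factory=list)

    def append(self, kind: str, payload: str) -> Event:
        event = Event(seq=len(self.events), kind=kind, payload=payload)
        self.events.append(event)
        return event


def run_brain(log: SessionLog, plan: list, crash_after: Optional[int] = None) -> None:
    """Stateless brain: it derives its position purely from the log.

    `plan` is a fixed list of step names standing in for model decisions.
    `crash_after` simulates the brain dying after that many new steps.
    """
    done = [e.payload for e in log.events if e.kind == "step_done"]
    steps_this_run = 0
    for step in plan[len(done):]:
        if crash_after is not None and steps_this_run >= crash_after:
            raise RuntimeError("brain crashed")
        log.append("step_done", step)
        steps_this_run += 1


log = SessionLog()
plan = ["fetch", "analyze", "report"]
try:
    run_brain(log, plan, crash_after=2)  # first brain dies before "report"
except RuntimeError:
    pass  # a supervisor would start a fresh brain here
run_brain(log, plan)  # the new brain reads the log and resumes
completed = [e.payload for e in log.events if e.kind == "step_done"]
```

Because the brain keeps no state of its own, the second call needs nothing but the log to pick up at the first unfinished step.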

Why Managed Agents Are a Genuine Product Category
It's worth emphasizing that none of this infrastructure work (container orchestration, state persistence, failure recovery, credential management) is the interesting part of building an agent. It's plumbing, and every company keeps rebuilding the same plumbing.
This is exactly why cloud providers are absorbing this work. Managed Agents have become a genuine product category rather than just a feature because providers now deliver not only the model but also the runtime your product is built on.
Take credential management as an example: if an agent acts on behalf of a user — say, sending a Slack message or reading a GitHub repository — you need OAuth tokens, token refresh mechanisms, and assurance that tokens never enter the sandbox environment. Building this system correctly takes weeks; hardening it takes months.
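One common way to keep tokens out of the sandbox is to hand the sandbox an opaque handle and attach the real token in a broker that runs outside it. The sketch below assumes that design. `CredentialBroker`, `sandboxed_tool`, the `send` callback, and the example URL are hypothetical, and token refresh is left out for brevity.

```python
import secrets


class CredentialBroker:
    """Runs outside the sandbox; holds real tokens keyed by opaque handles."""

    def __init__(self):
        self._tokens = {}

    def register(self, token: str) -> str:
        # The handle is random, so it reveals nothing about the token.
        handle = "cred_" + secrets.token_hex(8)
        self._tokens[handle] = token
        return handle

    def call_api(self, handle: str, request: dict, send) -> dict:
        """Attach the real token and forward the request via `send`."""
        token = self._tokens[handle]
        headers = dict(request.get("headers", {}))
        headers["Authorization"] = f"Bearer {token}"
        response = send({**request, "headers": headers})
        # Return only status and body so the token never flows back.
        return {"status": response["status"], "body": response["body"]}


def sandboxed_tool(handle: str, broker: CredentialBroker, send) -> dict:
    """Code inside the sandbox sees only the handle, never the token."""
    request = {"url": "https://example.invalid/messages", "body": "hi"}
    return broker.call_api(handle, request, send)


# A fake transport standing in for a real HTTP client.
seen_requests = []


def fake_send(request: dict) -> dict:
    seen_requests.append(request)
    return {"status": 200, "body": "ok", "echo_headers": request["headers"]}


broker = CredentialBroker()
handle = broker.register("real-oauth-token")
result = sandboxed_tool(handle, broker, fake_send)
```

Even if a prompt injection convinces the agent to print everything it can see, the only credential-shaped value in the sandbox is the handle, which is useless without the broker.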

Anthropic vs. Google: Two Fundamentally Different Managed Agent Philosophies
Both companies have launched Managed Agents, but their implementations differ dramatically, reflecting a fundamental disagreement about what a Managed Agent should look like.
Anthropic: A Depth-First Agent Platform
Anthropic's API design exposes the full machinery. You need to separately create an agent, an environment, a session, and then send events to a specific session — four resources mapped to four different endpoints.
Core advantages:
- Agents themselves are versioned and support rolling updates
- Sessions stream typed events (message, tool_use, status, etc.)
- Supports mid-run interruption, injecting new messages, updating tools or MCP servers
- Complete set of pre-built tools: Bash, file operations, web search, MCP support
- Vaults: Manage and refresh credentials for each end user
- Memory Stores: Cross-session persistence with version records for every write
- Outcomes: Give the agent a scoring rubric, and the platform spins up an independent evaluator to assess work quality
- Dream: Asynchronous tasks that read historical sessions and consolidate memories into cleaner storage
Anthropic is essentially building the agent runtime as an operating system — virtualizing components, exposing stable interfaces, and absorbing the entire operations layer.
Google: A Simplicity-First Agent Approach
Google's approach is extremely streamlined. You need just one call: interaction.create, passing in the agent ID, input, and environment parameters. The agent plans, executes, observes results, loops until done, then returns the final output.
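To make the loop concrete, here is a generic sketch of the plan, execute, observe cycle that such a call hides from you. It is not Google's API or implementation: `run_agent`, the `decide` callback (standing in for the model), the tool table, and the step budget are all assumptions made for illustration.

```python
def run_agent(task, decide, tools, max_steps=10):
    """Loop until `decide` returns a finish action or the budget runs out."""
    observations = []
    for _ in range(max_steps):
        action = decide(task, observations)       # plan the next action
        if action["type"] == "finish":
            return action["output"]               # final output
        tool = tools[action["tool"]]
        observations.append(tool(action["input"]))  # execute, then observe
    raise RuntimeError("step budget exhausted")


def scripted_decide(task, observations):
    """Hypothetical stand-in for the model: one tool call, then finish."""
    if not observations:
        return {"type": "tool", "tool": "add", "input": (2, 3)}
    return {"type": "finish", "output": f"{task}: {observations[-1]}"}


answer = run_agent("sum", scripted_decide, {"add": lambda pair: pair[0] + pair[1]})
```

With a managed runtime, everything inside `run_agent` runs on the provider's side, and the caller only sees the final output.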

Current state:
- The toolset includes only code execution, Google Search, URL fetching, and file system access
- No MCP support, no custom function calling, no credential vault
- Customization is done through Markdown files rather than JSON
But here's an important detail: Google actually has a second Managed Agent product — the Gemini Enterprise Agent Platform. It uses the same underlying engine but with a feature set much closer to Anthropic's: MCP server support, OAuth credential manager, Memory Bank, skill registry, and even an inter-agent coordination framework. However, this version is currently in private preview, and Google explicitly warns against using it with confidential data.

The Pricing Trap of Managed Agents
Anthropic charges standard token rates plus 8 cents per active session. Google charges only token rates during the preview period, with sandbox compute being free.
On the surface, Google looks cheaper, but the reality is more nuanced. Flash's per-token prices are lower than Opus's, yet Gemini 3.5 Flash costs several times more per token than the previous generation of Flash, and a single agent run can consume 3 to 5 million tokens. Google's own tests put the cost per interaction at around $5 at this scale.
Cheap per-token pricing does not equal cheap per-run costs. This is a critical point that's easy to overlook when making platform decisions.
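A quick back-of-the-envelope calculation shows why. The per-million-token prices and the token counts below are hypothetical placeholders chosen for round numbers, not either provider's actual rates, and the $0.08 session fee is simply the figure quoted above. Substitute current prices before relying on the result.

```python
def run_cost(input_tokens, output_tokens, price_in_per_m, price_out_per_m, session_fee=0.0):
    """Cost in dollars of one run, given prices per million tokens."""
    token_cost = (
        (input_tokens / 1_000_000) * price_in_per_m
        + (output_tokens / 1_000_000) * price_out_per_m
    )
    return token_cost + session_fee


# Hypothetical prices: $1.00 per million input tokens, $4.00 per million output tokens.
PRICE_IN, PRICE_OUT = 1.00, 4.00

# A single chat-style call: 2,000 tokens in, 500 tokens out.
single_call = run_cost(2_000, 500, PRICE_IN, PRICE_OUT)

# A long agent run: 3.5 million tokens in, 0.5 million tokens out.
agent_run = run_cost(3_500_000, 500_000, PRICE_IN, PRICE_OUT)

# The same run with a flat per-session fee added on top.
agent_run_with_fee = run_cost(3_500_000, 500_000, PRICE_IN, PRICE_OUT, session_fee=0.08)
```

At these assumed prices the single call costs well under a cent, while the agent run costs $5.50. Token volume dominates the bill, and the flat session fee is small by comparison.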
Risks Builders Need to Watch Out For
Vendor Lock-In Risk
The obvious lock-in is API incompatibility — Anthropic's API doesn't interoperate with Gemini's. But the more insidious lock-in is what will actually wreck production systems:
These systems are inherently non-deterministic. Even within the same provider, the underlying model can change without your knowledge — system prompts get modified, models get quantized to reduce costs, safety behaviors get re-tuned. Every change can alter how the agent behaves, and these changes won't appear in any changelog.
The way you typically discover the problem: evaluation metrics start drifting, or users start complaining. Tool calls degrade, reasoning chains get shorter, tasks that completed fine last week suddenly start failing.
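A small regression check run on a schedule can surface this drift before users do. The sketch below compares the pass rate of a recent evaluation batch against a stored baseline. `detect_drift`, the 5-point tolerance, and the sample results are assumptions for illustration, and a production setup would also account for sample size and noise.

```python
def detect_drift(baseline, recent, tolerance=0.05):
    """Flag drift when the recent pass rate drops more than `tolerance` below baseline.

    `baseline` and `recent` are lists of 1 (task passed) or 0 (task failed).
    """
    baseline_rate = sum(baseline) / len(baseline)
    recent_rate = sum(recent) / len(recent)
    return {
        "baseline_rate": baseline_rate,
        "recent_rate": recent_rate,
        "drifted": (baseline_rate - recent_rate) > tolerance,
    }


# Hypothetical results from the same 20-task evaluation suite.
last_week = [1] * 18 + [0] * 2   # 18 of 20 tasks passed
this_week = [1] * 14 + [0] * 6   # 14 of 20 tasks passed

report = detect_drift(last_week, this_week)
```

Here `report["drifted"]` is true because the pass rate fell from 90% to 70%, which is the kind of silent regression described above.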
Compliance Limitations
Both products are stateful by design, which means neither currently meets zero-data-retention or HIPAA Business Associate Agreement requirements. Both are still in preview or beta, and pricing and feature sets are subject to change at any time.
How to Choose a Managed Agent Platform
In summary, the choice depends on where your differentiation lies:
- If your differentiation is in how the agent works — the tools it uses, the credentials it carries, the way it iterates toward a goal — Anthropic is the platform built for this today
- If your differentiation is in what the agent produces, and you want the simplest path to get up and running quickly — Google is the better fit
Regardless of which path you choose, invest in evaluation systems, continuously track output quality, and don't hardcode assumptions about model behavior into critical parts of your system.
This category is becoming the default way frontier AI providers deliver agent capabilities. Anthropic moved first, Google followed, and AWS and OpenAI are on the way. If you're building anything that runs longer than a single API call, the next question to answer is: Do you run this agent loop yourself, or hand it to a provider?