Meta Partners with AWS: Bringing in Tens of Millions of Graviton Cores to Expand AI Infrastructure

Meta has announced a major partnership with AWS to incorporate tens of millions of ARM-based Graviton processor cores into its computing portfolio. This is a key step in Meta's AI infrastructure diversification strategy, primarily targeting AI inference to provide large-scale, low-latency compute for the Meta AI assistant and Agentic experiences serving billions of users. The deal signals that even hyperscale tech companies turn to external cloud services to scale quickly amid surging AI compute demand.
Partnership Overview
Meta has announced a major agreement with Amazon Web Services (AWS) to incorporate tens of millions of AWS Graviton processor cores into its computing resource portfolio. This partnership marks a critical step in Meta's AI infrastructure diversification strategy, aimed at providing more powerful computing support for Meta AI and Agentic experiences that serve billions of users.

AWS Graviton Processors: ARM Architecture's Cloud Powerhouse
What Are Graviton Processors
AWS Graviton is Amazon's custom-designed ARM-based server processor, optimized specifically for cloud computing workloads. Compared to traditional x86 architecture processors, the Graviton series offers significant advantages in performance-per-watt, delivering excellent computing performance at lower power consumption.
The Graviton line has gone through four generations. The first-generation Graviton (2018) was based on ARM Cortex-A72 cores and was used primarily for lightweight workloads. Graviton2, announced in 2019, was a large step forward: it adopted 64 Neoverse N1 cores, and AWS claimed up to 40% better price-performance than comparable x86 instances. Graviton3 followed in 2021, and Graviton4, launched in 2023, is based on the ARM Neoverse V2 architecture with 96 cores, featuring significantly improved memory bandwidth and compute density. This continuous iteration has moved Graviton from "good enough" to "preferred choice," attracting major enterprises like Apple and Stripe to migrate core workloads to Graviton instances. The latest Graviton4 processor has demonstrated strong price-performance across a variety of workloads.
Why Meta Chose Graviton
For a company like Meta that needs to handle massive AI inference requests, choosing Graviton cores involves several key considerations:
- Energy Efficiency: ARM architecture's inherently low power consumption helps reduce operational costs for large-scale data centers
- Scale Deployment: A commitment of tens of millions of cores indicates that this is a strategic infrastructure investment rather than a pilot project
- Diversification Strategy: Reducing dependence on a single hardware vendor while enhancing supply chain resilience
Strategic Significance: Diversifying AI Infrastructure
Expanding Compute from Training to Inference
Meta explicitly stated in its announcement that this partnership is part of its effort to "diversify AI infrastructure." In the current intensely competitive AI compute landscape, Meta has already invested heavily in NVIDIA GPUs for model training, while this AWS partnership likely focuses on AI inference — deploying trained models into production environments to serve users.
AI inference and training have different compute requirements, and that distinction is key to understanding Meta's choice of Graviton. Training runs backpropagation over massive datasets, relying on floating-point arithmetic (FP32/BF16) and very large memory capacity, which is why GPUs such as NVIDIA's A100 and H100 dominate it. In inference the model weights are fixed, and the core challenge is serving user requests with minimal latency and maximum throughput at large scale rather than reaching peak floating-point performance. Inference can often use INT8 or even INT4 quantization, which sharply reduces precision requirements and makes CPUs and dedicated inference chips competitive. Industry estimates suggest that over 80% of AI compute costs for large internet companies come from inference rather than training, which is the economic logic behind Meta's heavy bet on NVIDIA GPUs for training while turning to Graviton for inference.
In these scenarios, Graviton processors may offer better cost-effectiveness than GPUs.
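To illustrate what INT8 quantization means in practice, here is a minimal sketch of symmetric per-tensor quantization in plain Python. The function names and the sample weights are hypothetical and chosen only for illustration; production inference stacks use more elaborate schemes (per-channel scales, calibration data), and nothing here reflects how Meta or AWS actually quantize models.

```python
# Illustrative sketch of symmetric per-tensor INT8 quantization.
# All names and values are hypothetical, not taken from any real pipeline.

def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.82, -1.27, 0.05, 0.40, -0.63]  # made-up example weights
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The round-trip error per weight is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized)
print(f"scale={scale:.4f}, max round-trip error={max_error:.6f}")
```

Each weight is stored in one byte instead of four (FP32), and the arithmetic becomes integer arithmetic, which is the kind of work general-purpose CPU cores handle well.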
Serving Meta AI and Agentic Experiences
The announcement specifically highlighted two core application areas: Meta AI and Agentic experiences. Meta AI is a multimodal AI assistant built on the Llama series of large language models, deeply integrated into WhatsApp, Instagram, Facebook, and Messenger, four platforms that together have more than 3 billion monthly active users.
"Agentic experiences" represent a significant leap in AI capabilities: AI agents can autonomously decompose complex tasks, invoke external tools (such as search, code execution, and API calls), maintain context across multi-step reasoning, and ultimately accomplish user-delegated goals. These applications multiply compute demand: a single Agentic task may trigger dozens of model inference calls, and across billions of users that adds up to tens of billions of inference requests per day even at one task per user, and potentially trillions with heavier use. Both types of applications require real-time responses for billions of users, and the tens of millions of Graviton cores are intended to meet this rapid growth in compute demand.
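The multiplication behind this kind of estimate can be sketched in a few lines. Every input below (user counts, tasks per user, calls per task) is a hypothetical assumption chosen for illustration, not a figure from Meta or AWS.

```python
# Back-of-envelope estimate; every input below is a hypothetical assumption.

def daily_inference_calls(daily_users, tasks_per_user, calls_per_task):
    """Total model inference calls per day under a simple multiplicative model."""
    return daily_users * tasks_per_user * calls_per_task


# (daily users, Agentic tasks per user per day, inference calls per task)
scenarios = {
    "light": (1_000_000_000, 1, 12),
    "moderate": (3_000_000_000, 1, 30),
    "heavy": (3_000_000_000, 10, 36),
}

estimates = {name: daily_inference_calls(*args) for name, args in scenarios.items()}
for name, calls in estimates.items():
    print(f"{name:>8}: {calls:,} calls/day")
```

Under these made-up inputs the total ranges from roughly ten billion to about a trillion calls per day, which shows how sensitive the estimate is to the assumed number of tasks per user.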
Industry Impact and Future Outlook
The Delicate Relationship Between Cloud Giants
Meta, as a tech giant, choosing to use AWS infrastructure is uncommon in the industry. Typically, hyperscale companies like Meta prefer to build their own data centers and develop custom hardware. Meta disclosed in its 2024 financial report that capital expenditure reached $37 billion that year, with the majority going toward AI infrastructure construction — yet this still couldn't meet the demand gap.
This has given rise to a "hybrid infrastructure" strategy: placing core, stable workloads in proprietary data centers while sending burst and elastic demand to cloud providers. AWS's advantage lies in its ability to rapidly provision tens of millions of cores' worth of elastic compute without Meta bearing the corresponding construction timelines and capital risk. This model, known in the industry as "cloud bursting," is becoming a common pattern in AI-era infrastructure planning. The partnership shows that even a giant like Meta leans on external cloud services to scale quickly when AI compute demand is growing this fast.
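The basic idea of the hybrid model can be sketched as a capacity split: demand up to the owned capacity stays in proprietary data centers, and anything above it overflows to the cloud. The function name, the demand series, and the capacity figure below are all hypothetical; real placement decisions also weigh latency, data locality, and contract terms.

```python
# Minimal sketch of a hybrid capacity split; all numbers are hypothetical.

def split_demand(demand_cores, owned_capacity_cores):
    """Return (cores served on owned hardware, cores overflowed to the cloud)."""
    on_prem = min(demand_cores, owned_capacity_cores)
    burst = demand_cores - on_prem
    return on_prem, burst


# Made-up demand samples and owned capacity, in millions of cores.
demand_samples = [40, 55, 90, 120, 70]
owned_capacity = 80

plan = [split_demand(d, owned_capacity) for d in demand_samples]
peak_burst = max(burst for _, burst in plan)

for demand, (on_prem, burst) in zip(demand_samples, plan):
    print(f"demand={demand:>3}M  owned={on_prem:>3}M  cloud={burst:>3}M")
print(f"peak cloud requirement: {peak_burst}M cores")
```

In this toy example the owned fleet only has to be sized for steady demand, while the cloud absorbs the peaks.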
Boosting the ARM Server Ecosystem
ARM architecture's journey into data centers hasn't been smooth sailing. In 2016, SoftBank acquired ARM for $32 billion, giving it more R&D resources. A real turning point came with Apple's M1 chip launch in 2020, which showed that ARM-based designs could compete on raw performance and not only on power efficiency. On the server side, beyond AWS Graviton, Ampere Computing's Altra series, Microsoft Azure's Cobalt 100, and Google's Axion processor are all ARM-based, forming a competitive landscape that rivals x86 (Intel Xeon, AMD EPYC).
The core advantages of ARM servers include: more cores integrated at the same power envelope, higher performance-per-watt from the RISC instruction set, and more flexible customization options. An order of tens of millions of cores sends a powerful signal to the ARM server ecosystem, further validating ARM architecture's viability and competitiveness in the data center space. Meta's large-scale procurement may accelerate more enterprises' migration from x86 to ARM architecture.
Summary
This partnership reflects a new paradigm in AI-era infrastructure development: even the largest tech companies are seeking strategic partnerships to address unprecedented compute challenges. Meta's choice of AWS Graviton represents both a pursuit of cost efficiency and a strategic move toward supply chain diversification. As AI applications continue to permeate the daily lives of billions of users, similar large-scale infrastructure partnerships are likely to become increasingly common.
Key Takeaways
- Meta has reached an agreement with AWS to incorporate tens of millions of Graviton processor cores into its computing resource portfolio
- This partnership is a critical component of Meta's diversified AI infrastructure strategy, aimed at reducing dependence on a single vendor
- The additional compute capacity will primarily serve the Meta AI assistant and Agentic experiences, covering billions of users
- ARM-based Graviton processors offer advantages in performance-per-watt, making them well-suited for large-scale AI inference deployment
- The partnership reflects that even hyperscale tech companies need external cloud services to rapidly scale when facing explosive AI compute demand