Huawei HDC Deep Dive: Pangu 2.0 Goes Open Source and HarmonyOS 7 Agents Reshape the Mobile AI Ecosystem

Huawei open-sources Pangu 2.0 and embeds AI Agents into HarmonyOS 7, reshaping the mobile AI landscape.
At HDC, Huawei announced the full open-sourcing of Pangu 2.0 — a 505B sparse model with 2x throughput on Ascend chips — and HarmonyOS 7 with on-device 30B AI Agents. The moves aim to build a full-stack ecosystem (Ascend + MindSpore + Pangu + HarmonyOS + Kirin) and capture the next-gen Agent gateway, challenging Apple, WeChat, and others for control of traffic, data sovereignty, and future ad revenue.
At the Huawei Developer Conference (HDC), Richard Yu dropped two bombshells: the full open-sourcing of the Pangu 2.0 large language model, and HarmonyOS 7 with deeply integrated AI. This isn't just a product iteration; it could mark the turning point in the smartphone industry's transition from the app era to the Agent era.
Pangu 2.0 Open Source: A Pragmatic Route Prioritizing Efficiency Over Parameters
Pangu 2.0 features a sparse architecture with 505 billion parameters and supports a 512K ultra-long context window. Frankly, this parameter count isn't jaw-dropping in an age where "trillion-parameter models are everywhere." But Richard Yu was refreshingly candid: computing power is insufficient, memory is too expensive, and Huawei is prioritizing efficiency over blindly scaling up.
Here's what the sparse architecture means technically. Unlike traditional dense models, which activate all parameters on every inference pass, sparse architectures use mechanisms like Mixture of Experts (MoE) to activate only a subset of parameters for each token. For example, a model with 505 billion total parameters might activate only tens of billions during any single forward pass, dramatically reducing computational overhead while preserving model capacity. Google's Switch Transformer and Mistral AI's Mixtral take similar approaches. The 512K context window means the model can process roughly 400,000 Chinese characters in a single input (the exact figure depends on the tokenizer), which is critical for long-text scenarios like document analysis and code comprehension and is four times GPT-4 Turbo's 128K window.
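To make the "activate only a subset" idea concrete, here is a minimal sketch of top-k expert routing in plain Python. It is an illustration under stated assumptions, not Pangu's actual router: the eight toy experts, the random router scores (standing in for a learned gating layer), and the names route_top_k and moe_forward are all hypothetical.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and renormalize their gate weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}

def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    gates = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in gates.items())
    return output, gates

# Eight toy "experts"; each one is just a scalar function (assumption).
experts = [lambda x, m=m: m * x for m in range(1, 9)]

# Stand-in for a learned router; a real model computes these from the input.
random.seed(0)
router_scores = [random.gauss(0.0, 1.0) for _ in experts]

output, gates = moe_forward(3.0, experts, router_scores, k=2)
active_fraction = len(gates) / len(experts)
print(f"experts used: {sorted(gates)}; active fraction: {active_fraction:.0%}")
```

With 2 of 8 experts selected, only a quarter of the expert parameters do any work for this input, which is the same lever a 505B sparse model pulls at a much larger scale.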
The truly noteworthy data point is this: on Ascend chips, Pangu's single-card throughput is twice that of mainstream open-source models in the industry. In other words, for throughput-bound workloads, a task that takes competitors two cards can be handled by Pangu with one. Roughly half the parameters of a trillion-scale model, double the per-card throughput: this reflects Huawei's hard-earned engineering optimization prowess.
Throughput is a core metric for measuring AI inference efficiency, typically measured in tokens processed per second. Huawei's achievement involves multi-layered engineering optimizations: operator fusion (merging multiple computation steps into a single execution), memory optimization (such as efficient attention mechanisms like FlashAttention), and deep adaptation of Huawei's proprietary CANN (Compute Architecture for Neural Networks) heterogeneous computing architecture to the Ascend 910B chip. This kind of hardware-software co-optimization capability — similar to how NVIDIA's CUDA ecosystem empowers its GPUs — represents a competitive moat that pure software companies cannot easily replicate.
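The "one card instead of two" arithmetic can be sketched as follows. The token counts and the service-wide demand are hypothetical numbers chosen for illustration; only the 2x ratio comes from the claim above.

```python
import math

def tokens_per_second(tokens_generated, elapsed_seconds):
    """Throughput: tokens processed per unit of wall-clock time."""
    return tokens_generated / elapsed_seconds

def cards_needed(target_tps, per_card_tps):
    """Cards required to serve a target load, assuming throughput scales linearly."""
    return math.ceil(target_tps / per_card_tps)

# Hypothetical baseline: 12,000 tokens in 10 seconds on one card.
baseline_tps = tokens_per_second(12_000, 10.0)
# The claimed 2x single-card throughput.
optimized_tps = 2 * baseline_tps

# Hypothetical service-wide demand in tokens per second.
target_tps = 24_000
print(cards_needed(target_tps, baseline_tps), cards_needed(target_tps, optimized_tps))
# prints: 20 10
```

The linear-scaling assumption ignores batching effects and inter-card communication, so real deployments would need measured figures rather than this back-of-the-envelope estimate.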
More importantly, there's a strategic shift. Starting June 30, all seven core components of Pangu 2.0 will be fully open-sourced, including training code and training operators.

The old Pangu was an enterprise product — regular developers could only pay to call APIs. Now it's become an open foundation model. This isn't charity; it's a carefully designed ecosystem play: Ascend chips + MindSpore framework + Pangu model + HarmonyOS + Kirin chips — all five pieces are self-developed and fully integrated end-to-end. The more people use it, the more solid Huawei's technological foundation becomes.
Within this full-stack system, the MindSpore framework plays a crucial bridging role. As Huawei's self-developed deep learning framework (comparable to Google's TensorFlow and Meta's PyTorch), MindSpore sits between two layers: below it, it is tuned to the hardware characteristics of Ascend chips; above it, it supports training and inference for large models like Pangu. Its core features include automatic parallelism (automatically distributing models across multiple chips for training) and graph-operator fusion (unified optimization of computation graphs and operator compilation). In the current context of US-China tech competition, PyTorch's support for Ascend chips carries uncertainty. MindSpore's existence ensures that Huawei's AI ecosystem won't be held hostage by policy changes in external frameworks.
HarmonyOS 7 AI Transformation: How On-Device Agents Change Human-Computer Interaction
If Pangu's open-sourcing is lowering the ladder for developers to come aboard, HarmonyOS 7 is the real engine. Richard Yu's definition: HarmonyOS 7 is the first operating system to complete a full AI transformation.
Specifically, the Pangu large model is embedded directly into the system kernel. A 30B-parameter model runs on the device itself, with no internet connection required, and the reported success rate on complex tasks exceeds 90%. Because inference runs entirely on the phone, responses are fast and private data does not have to be uploaded to the cloud.
Running a 30B (30 billion) parameter large model on a smartphone is an extremely challenging engineering feat. At FP16 precision, each parameter takes 2 bytes, so the weights of a 30B model alone require approximately 60GB of memory, far exceeding the memory capacity of any current smartphone. To achieve on-device operation, multiple model compression techniques must be employed in combination: quantization (reducing FP16 to INT4 shrinks the weights by 4x, and 2-bit precision by up to 8x), pruning (removing unimportant parameter connections), knowledge distillation (using a large model to guide the training of a smaller one), and more. Huawei's Kirin chips include a dedicated NPU (Neural Processing Unit), with the latest generation reportedly delivering over 45 TOPS of computing power. Combined with on-device optimized models, this enables usable inference speeds. Apple's A17 Pro and Qualcomm's Snapdragon 8 Gen 3 are also ramping up NPU capabilities; on-device AI has become a core competitive dimension for chip manufacturers.
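The memory arithmetic above, plus the basic idea behind quantization, can be sketched in a few lines. This is a simplified illustration: it counts weights only (no activations or KV cache), uses decimal gigabytes, and applies a plain symmetric per-tensor INT4 scheme that is an assumption here, not Huawei's disclosed method. The sample weights are made up.

```python
BITS_PER_PARAM = {"FP16": 16, "INT8": 8, "INT4": 4}

def weight_memory_gb(num_params, bits_per_param):
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

def quantize_int4(values):
    """Symmetric per-tensor quantization to the signed 4-bit range [-8, 7]."""
    scale = max(abs(v) for v in values) / 7 or 1.0  # fall back to 1.0 for all-zero input
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]

params = 30e9
for name, bits in BITS_PER_PARAM.items():
    print(f"{name}: {weight_memory_gb(params, bits):.0f} GB")
# prints:
# FP16: 60 GB
# INT8: 30 GB
# INT4: 15 GB

# Made-up weights to show the precision lost in the round trip.
weights = [0.12, -0.70, 0.33, 0.04]
codes, scale = quantize_int4(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, f"max round-trip error: {max_error:.4f}")
```

Even at INT4, 15GB of weights is a tight fit for phone memory, which is why quantization is combined with pruning and distillation rather than used alone.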

Here's a scenario: you tell your phone "I want to go hiking this weekend," and the system's AI assistant automatically works across apps to recommend routes, gear, weather information, and even helps you contact companions. This isn't a simple voice command — it's an Agent that understands intent and autonomously plans and executes.
It's worth clarifying the fundamental difference between AI Agents and traditional voice assistants. Traditional voice assistants like Apple's Siri and Xiaomi's Xiao AI are essentially "command-response" systems: users issue explicit commands, and the system executes a single action (like "set an alarm" or "play music"). Agents, however, possess three key capabilities: intent understanding (inferring users' true needs from ambiguous natural language), task planning (decomposing complex needs into multiple subtasks and determining execution order), and tool invocation (autonomously calling multiple apps' APIs to complete each subtask). This relies on the reasoning capabilities of large language models and Function Calling mechanisms. A widely cited formulation from academia is ReAct (Reasoning + Acting): the model reasons before each step, takes an action, observes the result, and then decides the next action, forming a closed-loop autonomous decision chain.
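The reason-act-observe loop described above can be sketched as follows. Everything here is hypothetical: the two tools stand in for app-provided APIs, and scripted_policy replaces the large language model with a fixed plan so the example runs without any model. It shows the control flow only, not how HarmonyOS implements its Agent.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (thought, action, observation)

# Hypothetical tools; a real Agent would call APIs exposed by installed apps.
def get_weather(day: str) -> str:
    return f"{day}: sunny, 22C"

def find_routes(area: str) -> str:
    return f"2 trails near {area}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "get_weather": get_weather,
    "find_routes": find_routes,
}

def scripted_policy(goal: str, history: List[Step]) -> Tuple[str, str, str]:
    """Stand-in for the LLM: returns (thought, action, argument) for the next step."""
    plan = [
        ("Check the weather before planning.", "get_weather", "Saturday"),
        ("Weather is known; look for routes.", "find_routes", "the city"),
        ("Enough information to answer.", "finish", ""),
    ]
    return plan[len(history)]

def run_agent(goal: str, policy, tools, max_steps: int = 5) -> List[Step]:
    """Reason, act, observe, and repeat until the policy decides to finish."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, argument = policy(goal, history)
        if action == "finish":
            break
        observation = tools[action](argument)  # tool invocation
        history.append((thought, action, observation))
    return history

trace = run_agent("I want to go hiking this weekend", scripted_policy, TOOLS)
for thought, action, observation in trace:
    print(f"Thought: {thought}\nAction: {action}\nObservation: {observation}")
```

In a real system the policy would be a model call that sees the goal and the accumulated observations, and the max_steps cap guards against a loop that never reaches "finish".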
Looking back, when HarmonyOS was first announced in 2019, it was mocked as a "PowerPoint OS." Today, it has 1.3 billion devices online and 11 million developers on board, making it China's second-largest smartphone operating system. This transformation took a full seven years.
The Battle for the Agent Gateway: Why Apple, WeChat, and Huawei Are Going All-In
Apple is strengthening Siri's Agent capabilities, WeChat is beta-testing AI agents, and Huawei has built its Agent directly into the system kernel. Behind this race is a battle for the next-generation internet gateway.
Redefining the Traffic Gateway
For the past 15 years, the smartphone traffic gateway has been the app icon: wherever users tap, that's where traffic flows. But the Agent era operates on entirely different logic. Users no longer need to manually open any app; they simply state their needs, and the Agent completes tasks across apps. Whoever controls the Agent controls the traffic lifeline of every app, a position worth well over a hundred times WeChat's mini-program grid.
The Battle for Data Sovereignty
An Agent that truly "knows you" needs access to your calendar, location, spending habits, chat history, and even your bank balance and social connections. This data used to be scattered across dozens of apps; the emergence of Agents brings it all together in one place.

This is the deeper reason Huawei insists on embedding the model in the kernel and running it locally — data stays on the phone, security belongs to the user, but the ecosystem belongs to Huawei. It's an elegant balance.
The Endgame for Business Models
When you tell your Agent "I'm hungry," it recommends three restaurants — who ranks first? Bid-based ranking. You say "buy running shoes," and the Agent compares prices across apps — which store appears at the top of the recommendation list? Whoever offers the highest revenue share wins.
Future internet advertising won't be placed inside apps but within the Agent's recommendation stream. This is a market worth trillions annually. Whoever captures the Agent gateway first takes the biggest commercial chip for the next decade.

Understanding these three layers of logic makes it clear why Apple is rushing to revive Siri, WeChat is testing AI agents, and Xiaomi will inevitably follow suit — inaction means handing over the data gateway and monetization channel for hundreds of millions of users.
HarmonyOS Star Shield: AI-Driven Chip-Level Security
Also worth mentioning is the HarmonyOS Star Shield feature, a prime example of Huawei applying AI capabilities to mobile security. It enables chip-level fraud interception, including identifying risks from overseas calls, detecting deepfake videos, and blocking malicious QR codes. AI isn't just being used to boost efficiency — it's also building stronger security defenses.
The so-called "chip-level" security involves a core technology known as TEE (Trusted Execution Environment). TEE carves out an isolated secure zone within the processor — even if the operating system is compromised, data and code within this zone remain protected. Huawei's implementation is based on the ARM TrustZone architecture with proprietary extensions, deploying the AI inference engine within the secure zone. This means the execution of AI models for deepfake detection and fraud identification is itself protected at the hardware level, preventing malicious software from tampering with the AI's judgments. This approach of deeply integrating AI capabilities with hardware security offers higher trustworthiness and attack resistance than purely software-based security detection.
Summary: The Strategic Intent Behind Huawei's Full-Stack Self-Developed System
The core message from Huawei's HDC can be summed up in one phrase: from "building tools" to "building brains." Pangu 2.0's open-sourcing lays the infrastructure for the AI ecosystem, HarmonyOS 7's embedded Agent redefines human-computer interaction, and the full-stack self-developed system of Ascend + Kirin + MindSpore + Pangu + HarmonyOS forms a moat that competitors will find extremely difficult to replicate.
The Mate 90 is expected to debut this fall with this complete solution. Whether the app era is truly coming to an end remains to be seen, but the trend of smartphones evolving from communication tools into personal AI assistants already looks irreversible.