Siri's Absence from WWDC2025: Analyzing Apple's Fragmented AI Breakout Strategy

The Unexpected Absence of the Core Protagonist

At Apple's Worldwide Developers Conference (WWDC2025), a surprising fact emerged: Siri, which was supposed to be the strategic core of Apple Intelligence, did not make its expected debut. According to official statements, Apple could not bring its on-device and cloud collaborative AI assistant up to release standards on schedule, so the company split the intelligent interaction tasks originally intended for Siri into multiple independent feature modules.

On-device and cloud collaboration refers to an architecture in which AI models run on both local devices and cloud servers, with computing resources allocated dynamically according to task complexity. Simple tasks (such as text completion) are handled by lightweight on-device models, while complex tasks (such as multi-turn reasoning) are sent to large cloud-based models. The challenges of this architecture are threefold: switching between device and cloud with only millisecond-level latency, protecting privacy when data is uploaded (Apple's Private Cloud Compute technology requires that data be destroyed immediately after cloud processing), and making the switch imperceptible to users. Apple first proposed this vision at WWDC2024 but still had not met its own standard a year later, which suggests that the engineering difficulty of balancing privacy protection with AI capabilities far exceeded expectations.
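The routing idea described above can be sketched in a few lines. This is a minimal illustration, not Apple's implementation: the `Task` type, the numeric `complexity` score, and the `CLOUD_THRESHOLD` value are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical cut-off: tasks scoring above this are sent to the cloud.
CLOUD_THRESHOLD = 0.5


@dataclass
class Task:
    name: str
    complexity: float  # assumed score from 0.0 (trivial) to 1.0 (hard)


def route(task: Task, network_available: bool = True) -> str:
    """Return "device" or "cloud" depending on task complexity.

    Falls back to the on-device model when there is no network,
    so the user still gets a (possibly weaker) answer.
    """
    if task.complexity <= CLOUD_THRESHOLD or not network_available:
        return "device"
    return "cloud"


if __name__ == "__main__":
    print(route(Task("text completion", 0.1)))        # device
    print(route(Task("multi-turn reasoning", 0.9)))   # cloud
    print(route(Task("multi-turn reasoning", 0.9), network_available=False))  # device
```

A real router would also have to weigh latency budgets and privacy constraints, which is where the difficulties listed above come in.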

This decision reflects Apple's consistent product philosophy—better to delay than compromise on quality. But for users and developers eagerly awaiting Apple's push into AI, this is an unmistakable signal: Apple's AI integration journey is more difficult than anticipated.

WWDC2025 coverage screenshot

Breaking Down the Fragmented AI Features

Visual Intelligence: iOS Version of "Circle to Search"

Apple's Visual Intelligence feature is widely seen as an iOS upgrade of Android's "Circle to Search." Users simply long-press the screenshot button, and the system automatically analyzes the current screen content. Whether it's product images on social media or itinerary information in emails, users can search for similar items or intelligently extract dates and addresses with a single tap.

"Circle to Search" was first introduced by Google in early 2024 alongside the Samsung Galaxy S24 series, allowing users to draw a circle on any interface to trigger visual search. Its underlying technology relies on multimodal AI models (such as Google Lens's visual understanding capabilities), capable of recognizing objects, text, landmarks, and more in images and returning search results. Apple's Visual Intelligence adopts a different interaction paradigm—long-pressing the screenshot button rather than manually drawing a circle—which reduces precision requirements, but the core technical approach is similar: both require an on-device Vision Foundation Model to perform semantic segmentation and entity recognition on screen content, then pass structured information to a search or action engine.

The highlight of this feature is its system-level integration capability—not limited to specific apps but covering all on-screen scenarios, reflecting Apple's approach of embedding AI at the system level.

Dual Translation Safeguards: Privacy-First Real-Time Communication Translation

Leveraging on-device AI models, Apple introduced an impressive dual translation solution for communication scenarios:

FaceTime video calls: Real-time bilingual subtitles float semi-transparently at the bottom of the screen without obscuring the person's image
Traditional phone calls: The system simultaneously broadcasts translated audio for both parties

All translation processing is completed on-device, so no conversation data is uploaded. This is Apple's core differentiator from other AI solutions: privacy protection is treated as the baseline for AI features rather than an optional add-on.

The technical foundation for Apple's insistence on on-device processing is the Neural Engine in its custom-designed chips. Starting with the A11 Bionic chip, Apple has continuously increased on-device AI computing power, with the M4 chip's Neural Engine reaching 38 TOPS (trillion operations per second). Running AI models on-device means user data never needs to leave the device, which removes upload and server-side storage as avenues for a data breach. The trade-off is limited model size: devices can typically run only models with roughly 3 billion parameters or fewer, while cloud-based large models often have hundreds of billions of parameters. This explains why Apple's translation and speech recognition features can run locally, but a unified AI assistant requiring deep reasoning still needs on-device and cloud collaboration.
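The size gap can be made concrete with back-of-the-envelope arithmetic. The sketch below estimates weight storage only (it ignores activations and caches); the 4-bit and 16-bit precisions and the 200-billion-parameter cloud model are illustrative assumptions, not figures published by Apple.

```python
def model_memory_gb(params: float, bits_per_weight: int) -> float:
    """Approximate storage for model weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # A 3-billion-parameter model quantized to 4 bits per weight (assumed precision).
    print(model_memory_gb(3e9, 4))     # 1.5
    # The same model at 16 bits per weight.
    print(model_memory_gb(3e9, 16))    # 6.0
    # A hypothetical 200-billion-parameter cloud model at 16 bits per weight.
    print(model_memory_gb(200e9, 16))  # 400.0
```

Under these assumptions, a small quantized model fits in a phone's memory alongside other apps, while a model with hundreds of billions of parameters does not come close.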

AI Phone Assistant: Blocking Spam at the Source

To address the rampant spam call problem, Apple introduced an AI Phone Assistant. When an unknown number calls, the system automatically answers and collects the caller's purpose through voice interaction—sales, delivery, personal matters, etc. Users can decide whether to pick up based on the AI's classification results, blocking spam at the source.

The design philosophy behind this feature is quite clever: rather than relying on a simple number blacklist, it uses AI to understand caller intent and leaves the decision to the user. The core technology is intent classification within Natural Language Understanding (NLU): the system must determine, within the caller's first few sentences, which predefined category the call falls into. This involves a chained inference pipeline of Automatic Speech Recognition (ASR), semantic understanding, and classification models. Unlike traditional number-tagging databases, this approach can handle new numbers that haven't been flagged and doesn't rely on third-party data sharing. Technical challenges include achieving accurate classification within the first few seconds of conversation, handling ambiguous or deceptive statements (such as telemarketers posing as delivery couriers), and designing natural AI response scripts that encourage callers to state their intent.
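To illustrate the classification step, here is a toy sketch that assumes the ASR stage has already produced a text transcript. Keyword overlap stands in for a trained intent model, and the categories and keyword lists are hypothetical; a production system would use learned classifiers rather than word matching.

```python
import re

# Hypothetical keyword lists per intent; a real system would use a trained model.
INTENT_KEYWORDS = {
    "sales": {"offer", "discount", "promotion", "insurance", "loan"},
    "delivery": {"package", "delivery", "courier", "parcel"},
    "personal": {"friend", "family", "mom", "dad", "colleague"},
}


def classify_intent(transcript: str) -> str:
    """Classify a caller's opening words into a predefined intent.

    Returns "unclear" when nothing matches or when the two best
    categories tie, e.g. a sales call dressed up as a delivery notice.
    """
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    scores = {name: len(words & kws) for name, kws in INTENT_KEYWORDS.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_intent, top_score), (_, second_score) = ranked[0], ranked[1]
    if top_score == 0 or top_score == second_score:
        return "unclear"
    return top_intent


if __name__ == "__main__":
    print(classify_intent("Hi, I have a package delivery for you"))  # delivery
    print(classify_intent("We have a special discount offer"))       # sales
    print(classify_intent("Your package comes with a discount"))     # unclear
```

Returning "unclear" on a tie is one simple way to surface the ambiguous cases mentioned above instead of guessing.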

Apple Watch Fitness Coach: Machine Learning Empowering Exercise

In the health domain, Apple Watch's AI fitness coach can analyze heart rate, pace, and other data in real time and dynamically adjust training plans. More notably, the system can use machine learning to replicate the voices of users' friends and family for encouragement, making exercise guidance more personalized and emotionally engaging.

This voice replication feature is based on voice cloning technology, which typically requires the target speaker to provide several minutes of voice samples, after which the AI model can learn their timbre, intonation, and speaking rhythm to generate synthesized speech for any text. Apple previously launched the "Personal Voice" feature in iOS 17, allowing users to record 150 sentences to create a voice replica of themselves (designed for people at risk of losing their ability to speak, such as those with ALS). Extending this technology to friend and family voice encouragement scenarios requires strict authorization mechanisms: explicit consent from the person being cloned must be obtained to prevent deepfake abuse. This is one of the reasons Apple repeatedly emphasizes privacy and authorization in its AI features.

Strategic Considerations Behind the "Craftsmanship" Aesthetic

Looking at the keynote as a whole, Apple's AI strategy can be summarized as "breaking the whole into parts." Each independent feature has been meticulously polished, reflecting Apple's signature craftsmanship aesthetic—pursuing the ultimate in single-point experiences rather than an all-encompassing AI platform.

However, this fragmented strategy also reveals concerns:

Lack of a unified AI interaction entry point: Users need to trigger different features in different scenarios, which steepens the learning curve
The gap with competitors may widen: When Google, OpenAI, and others have already launched unified AI assistants, Apple's dispersed strategy may prevent users from perceiving its overall AI capabilities
Developer ecosystem uncertainty: Without a unified AI framework, third-party developers struggle to build coherent intelligent experiences

The current industry trend for AI assistants is the "Unified Agent" model: users complete all tasks through a single conversational interface, with AI autonomously calling tools, APIs, and subsystems. OpenAI's ChatGPT, Google's Gemini, and Microsoft's Copilot are all evolving in this direction, with large language models serving as the "central brain" coordinating everything. Apple's fragmented strategy is closer to an "expert system combination" approach—each feature module is an independent vertical AI, individually optimized for specific scenarios. The former's advantages are lower user cognitive costs and stronger extensibility; the latter's advantages are controllable single-point experiences and clear privacy boundaries. Apple ultimately needs to find a compromise that maintains its privacy baseline while providing a unified experience.
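The structural difference between the two models can be sketched as follows. The two module functions and the `TOOLS` registry are hypothetical, and a trivial "name: payload" prefix rule stands in for the large language model that would pick the tool in a real unified agent.

```python
from typing import Callable, Dict


# Independent vertical modules: each handles exactly one scenario.
def translate(text: str) -> str:
    return f"[translation of: {text}]"


def screen_call(text: str) -> str:
    return f"[call screened: {text}]"


# Registry that a unified agent draws on (hypothetical names).
TOOLS: Dict[str, Callable[[str], str]] = {
    "translate": translate,
    "screen_call": screen_call,
}


def unified_agent(request: str) -> str:
    """Single entry point that dispatches a request to one module.

    The request format "tool_name: payload" is a stand-in for the
    planning an LLM would do in a real agent.
    """
    tool_name, _, payload = request.partition(":")
    tool = TOOLS.get(tool_name.strip())
    if tool is None:
        return "no tool available"
    return tool(payload.strip())


if __name__ == "__main__":
    print(unified_agent("translate: bonjour"))       # [translation of: bonjour]
    print(unified_agent("book_flight: Paris"))       # no tool available
```

In the fragmented model, users call `translate` or `screen_call` directly from separate entry points; in the unified model, they only ever talk to `unified_agent`, and new capabilities are added by registering more tools.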

The Promise and Suspense of 2026

The keynote's closing remarks were thought-provoking—Apple repeatedly hinted that it is developing more powerful on-device and cloud collaborative capabilities for Siri, promising to launch a truly integrated Apple Intelligence assistant in 2026.

This means Apple has set itself a one-year window. During this year, the fragmented AI features serve as both a transitional solution and a testing ground for technical validation. User feedback on each independent module will inform the design of the final version of Siri.

The question is: will the competitive pace in the AI field give Apple this year of breathing room? When users have already grown accustomed to ChatGPT-style unified conversational experiences, can Apple's "craftsmanship" approach ultimately converge into a convincing holistic solution? The answer may not be revealed until 2026.

Key Takeaways

Siri was absent from WWDC2025 because it did not meet Apple's release standards for on-device and cloud collaboration; Apple instead split its AI features into multiple independent modules
Visual Intelligence, real-time translation, AI Phone Assistant, and other features all emphasize on-device processing and privacy protection
Apple adopted a fragmented AI strategy, pursuing ultimate single-point experiences but lacking a unified interaction entry point
Apple promised to launch a truly integrated Apple Intelligence assistant in 2026
The fragmented strategy serves as a transitional solution but also exposes the gap with competitors in AI integration capabilities