Siri's Absence from WWDC2025: Analyzing Apple's Fragmented AI Breakout Strategy

Siri missed WWDC2025; Apple fragmented its AI into independent feature modules, promising full integration by 2026.
At WWDC2025, the revamped Siri unexpectedly failed to appear because it did not meet Apple's release standards for on-device and cloud collaboration. Apple instead broke its AI strategy into independent feature modules (Visual Intelligence, real-time translation, AI Phone Assistant, and fitness coaching), all emphasizing on-device processing and privacy protection. This fragmented approach reflects Apple's meticulous craftsmanship philosophy, but it also raises concerns about the lack of a unified AI interaction entry point and a potentially widening gap with competitors. Apple has promised a truly integrated intelligent assistant by 2026.
The Unexpected Absence of the Core Protagonist
At Apple's Worldwide Developers Conference (WWDC2025), a surprising fact emerged: Siri, which was supposed to be the strategic core of Apple Intelligence, failed to make its expected debut. According to official statements, Apple was unable to meet the release standards for an on-device and cloud collaborative AI assistant on schedule, forcing the company to break down the intelligent interaction tasks originally carried by Siri into multiple independent feature modules.
On-device and cloud collaboration refers to an architecture in which AI models run both on local devices and on cloud servers, with computing resources allocated dynamically according to task complexity. Simple tasks (such as text completion) are handled by lightweight on-device models, while complex tasks (such as multi-turn reasoning) are sent to large cloud-based models. The architecture poses three main challenges: completing the device-cloud switch within millisecond-level latency, protecting privacy during data uploads (Apple's Private Cloud Compute technology requires that data be destroyed immediately after cloud processing), and making the switch imperceptible to users. Apple first proposed this vision at WWDC2024 but still had not met its own standard a year later, which suggests that the engineering difficulty of balancing privacy protection with AI capabilities far exceeds expectations.
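The routing idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Task` type, the `complexity` score, and the `ON_DEVICE_MAX_COMPLEXITY` threshold are hypothetical names invented here, and nothing in Apple's announcements says the decision is made this way.

```python
from dataclasses import dataclass

# Hypothetical threshold: the article gives no numbers for when a task
# counts as "complex" enough to leave the device.
ON_DEVICE_MAX_COMPLEXITY = 0.5


@dataclass
class Task:
    name: str
    complexity: float  # assumed score from 0.0 (trivial) to 1.0 (hard)


def route(task: Task, cloud_available: bool = True) -> str:
    """Return 'on_device' or 'cloud' for a task."""
    if task.complexity <= ON_DEVICE_MAX_COMPLEXITY:
        return "on_device"
    if not cloud_available:
        # Fall back to the local model instead of failing when offline.
        return "on_device"
    return "cloud"


if __name__ == "__main__":
    print(route(Task("text completion", 0.2)))
    print(route(Task("multi-turn reasoning", 0.9)))
```

A real system would also have to weigh latency, battery, and what data a request contains, which is where the engineering difficulty described above comes from.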
This decision reflects Apple's consistent product philosophy—better to delay than compromise on quality. But for users and developers eagerly awaiting Apple's push into AI, this is an unmistakable signal: Apple's AI integration journey is more difficult than anticipated.

Breaking Down the Fragmented AI Features
Visual Intelligence: iOS Version of "Circle to Search"
Apple's Visual Intelligence feature is widely seen as the iOS counterpart to Android's "Circle to Search." Users simply long-press the screenshot button, and the system automatically analyzes the current screen content. Whether the screen shows product images on social media or itinerary information in an email, users can search for similar items or extract dates and addresses with a single tap.
"Circle to Search" was first introduced by Google in early 2024 alongside the Samsung Galaxy S24 series, allowing users to draw a circle on any interface to trigger visual search. Its underlying technology relies on multimodal AI models (such as Google Lens's visual understanding capabilities), capable of recognizing objects, text, landmarks, and more in images and returning search results. Apple's Visual Intelligence adopts a different interaction paradigm—long-pressing the screenshot button rather than manually drawing a circle—which reduces precision requirements, but the core technical approach is similar: both require an on-device Vision Foundation Model to perform semantic segmentation and entity recognition on screen content, then pass structured information to a search or action engine.
The highlight of this feature is its system-level integration capability—not limited to specific apps but covering all on-screen scenarios, reflecting Apple's approach of embedding AI at the system level.
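The last step of the pipeline described above, turning recognized screen content into structured information for an action engine, can be illustrated with a small Python sketch. It assumes the screen text has already been recognized and only handles ISO-style dates (`YYYY-MM-DD`); the function name `extract_dates` and the pattern are illustrative choices, not Apple's implementation.

```python
import re
from datetime import date

# Assumption: dates appear in ISO form. Real screen content would need
# far more formats, plus address and entity recognition.
DATE_PATTERN = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def extract_dates(screen_text: str) -> list[date]:
    """Return every valid calendar date found in the recognized text."""
    found = []
    for year, month, day in DATE_PATTERN.findall(screen_text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            # Looks like a date but is not a real one (e.g. month 13).
            continue
    return found


if __name__ == "__main__":
    email = "Flight departs 2025-06-12 and returns 2025-06-20."
    for d in extract_dates(email):
        print(d.isoformat())
```

The resulting `date` objects are the kind of structured output that could then be handed to a calendar or search action.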
Dual Translation Safeguards: Privacy-First Real-Time Communication Translation
Leveraging on-device AI models, Apple introduced an impressive dual translation solution for communication scenarios:
- FaceTime video calls: Real-time bilingual subtitles float semi-transparently at the bottom of the screen without obscuring the person's image
- Traditional phone calls: The system simultaneously broadcasts translated audio for both parties
All translation processing is completed on-device, ensuring zero privacy data uploads. This is Apple's core differentiator from other AI solutions—treating privacy protection as the baseline for AI features rather than an optional add-on.
The technical foundation for Apple's insistence on on-device processing is the Neural Engine in its custom-designed chips. Starting with the A11 Bionic, Apple has steadily increased on-device AI computing power, with the M4 chip's Neural Engine reaching 38 TOPS (trillion operations per second). Running AI models on-device means user data never needs to leave the device, which removes the risk of exposure in transit or on a server. The trade-off is limited model size: devices can typically run only models of around 3 billion parameters or fewer, while cloud-based large models often have hundreds of billions of parameters. This explains why Apple's translation and speech recognition features can run locally, while a unified AI assistant requiring deep reasoning still needs on-device and cloud collaboration.
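A rough back-of-the-envelope calculation shows why parameter count is the limiting factor. The sketch below estimates weight storage only (parameters times bits per weight), ignoring activations and runtime overhead. The 300-billion figure is an illustrative stand-in for "hundreds of billions", and the bit widths are common quantization choices rather than figures Apple has published for these features.

```python
def model_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # 3 billion parameters at 16-bit and at 4-bit precision
    print(model_memory_gb(3e9, 16))    # 6.0
    print(model_memory_gb(3e9, 4))     # 1.5
    # Illustrative cloud-scale model: 300 billion parameters at 16-bit
    print(model_memory_gb(300e9, 16))  # 600.0
```

Under these assumptions a 3-billion-parameter model fits in a phone's memory only once quantized, while a cloud-scale model is out of reach by two orders of magnitude.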
AI Phone Assistant: Blocking Spam at the Source
To address the rampant spam call problem, Apple introduced an AI Phone Assistant. When an unknown number calls, the system automatically answers and collects the caller's purpose through voice interaction—sales, delivery, personal matters, etc. Users can decide whether to pick up based on the AI's classification results, blocking spam at the source.
The design philosophy behind this feature is quite clever: rather than a simple number blacklist mechanism, it uses AI to understand caller intent, putting users in control. The core technology is Intent Classification within Natural Language Understanding (NLU), where the system needs to quickly determine which predefined category the caller's purpose falls into within the first few sentences. This involves chained inference of Automatic Speech Recognition (ASR), semantic understanding, and classification models. Unlike traditional number-tagging databases, this approach can handle new numbers that haven't been flagged and doesn't rely on third-party data sharing. Technical challenges include: how to achieve accurate classification within the first few seconds of conversation, how to handle ambiguous expressions (such as telemarketers disguising themselves as delivery notifications), and how to design natural AI response scripts that encourage callers to state their intent.
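The classification step can be illustrated with a deliberately simple Python sketch. It assumes speech recognition has already produced a transcript and uses keyword matching as a stand-in for a trained intent classifier. The categories follow the article, but the keyword lists and the `classify_intent` function are hypothetical.

```python
from collections import Counter

# Hypothetical keyword lists standing in for a trained intent classifier.
INTENT_KEYWORDS = {
    "sales": {"offer", "discount", "promotion", "insurance", "loan"},
    "delivery": {"package", "parcel", "delivery", "courier"},
    "personal": {"friend", "family", "mom", "dad", "dinner"},
}


def classify_intent(transcript: str) -> str:
    """Return the best-matching category, or 'unknown' if unclear."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    scores = Counter()
    for intent, keywords in INTENT_KEYWORDS.items():
        scores[intent] = sum(1 for w in words if w in keywords)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, top), (_, second) = ranked[0], ranked[1]
    # No evidence, or two categories tied: leave the decision to the user.
    if top == 0 or top == second:
        return "unknown"
    return best


if __name__ == "__main__":
    print(classify_intent("Hi, I have a package for you from the courier."))
    print(classify_intent("I have a delivery about a loan"))
```

Returning "unknown" on a tie is one way to handle the ambiguous case mentioned above, where a sales call is dressed up as a delivery notification.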
Apple Watch Fitness Coach: Machine Learning Empowering Exercise
In the health domain, Apple Watch's AI fitness coach can analyze heart rate, pace, and other data in real time and dynamically adjust training plans. More innovatively, the system can use machine learning to replicate the voices of users' friends and family for encouragement, making exercise guidance more personalized and emotionally engaging.
This voice replication feature is based on voice cloning technology, which typically requires the target speaker to provide several minutes of voice samples, after which the AI model can learn their timbre, intonation, and speaking rhythm to generate synthesized speech for any text. Apple previously launched the "Personal Voice" feature in iOS 17, allowing users to record 150 sentences to create a voice replica of themselves (designed for people at risk of losing their ability to speak, such as those with ALS). Extending this technology to encouragement in the voices of friends and family requires strict authorization mechanisms: explicit consent must be obtained from the person being cloned to prevent deepfake abuse. This is one of the reasons Apple repeatedly emphasizes privacy and authorization in its AI features.
Strategic Considerations Behind the "Craftsmanship" Aesthetic
Looking at the keynote as a whole, Apple's AI strategy can be summarized as "breaking the whole into parts." Each independent feature has been meticulously polished, reflecting Apple's signature craftsmanship aesthetic—pursuing the ultimate in single-point experiences rather than an all-encompassing AI platform.
However, this fragmented strategy also reveals concerns:
- Lack of a unified AI interaction entry point: Users need to trigger different features in different scenarios, resulting in higher learning costs
- The gap with competitors may widen: With Google, OpenAI, and others having already launched unified AI assistants, Apple's dispersed strategy may prevent users from perceiving its overall AI capabilities
- Developer ecosystem uncertainty: Without a unified AI framework, third-party developers struggle to build coherent intelligent experiences
The current industry trend for AI assistants is the "Unified Agent" model: users complete all tasks through a single conversational interface, with AI autonomously calling tools, APIs, and subsystems. OpenAI's ChatGPT, Google's Gemini, and Microsoft's Copilot are all evolving in this direction, with large language models serving as the "central brain" coordinating everything. Apple's fragmented strategy is closer to an "expert system combination" approach—each feature module is an independent vertical AI, individually optimized for specific scenarios. The former's advantages are lower user cognitive costs and stronger extensibility; the latter's advantages are controllable single-point experiences and clear privacy boundaries. Apple ultimately needs to find a compromise that maintains its privacy baseline while providing a unified experience.
The Promise and Suspense of 2026
The keynote's closing remarks were thought-provoking—Apple repeatedly hinted that it is developing more powerful on-device and cloud collaborative capabilities for Siri, promising to launch a truly integrated Apple Intelligence assistant in 2026.
This means Apple has set itself a one-year window. During this year, the fragmented AI features serve as both a transitional solution and a testing ground for technical validation. User feedback from each independent module will inform the design of the final version of Siri.
The question is: will the competitive pace in the AI field give Apple this year of breathing room? When users have already grown accustomed to ChatGPT-style unified conversational experiences, can Apple's "craftsmanship" approach ultimately converge into a convincing holistic solution? The answer may not be revealed until 2026.
Key Takeaways
- Siri was absent from WWDC2025 because it failed to meet on-device and cloud collaboration standards; Apple broke its AI features into multiple independent modules
- Visual Intelligence, real-time translation, AI Phone Assistant, and other features all emphasize on-device processing and privacy protection
- Apple adopted a fragmented AI strategy, pursuing ultimate single-point experiences but lacking a unified interaction entry point
- Apple promised to launch a truly integrated Apple Intelligence assistant in 2026
- The fragmented strategy both serves as a transitional solution and exposes the gap with competitors in AI integration capabilities