Deep Dive into Google I/O 2025's Three Major Android Developer Productivity Updates

Google I/O 2025 has wrapped up, but the major updates around Android developer productivity deserve a thorough analysis. As the era of AI-driven development (Agentic Development) accelerates, Google released key updates around three pillars—Android CLI, Android Skills, and Android Bench—with a clear objective: enabling developers to efficiently build Android apps using any AI tool, Agent, or LLM.

Agentic Development is one of the most significant paradigm shifts in software engineering during 2024-2025. Unlike traditional AI code completion (such as GitHub Copilot's single/multi-line suggestions), Agentic Development emphasizes AI Agents that can autonomously plan and execute multi-step development tasks—from understanding requirements, designing solutions, and writing code to debugging and testing, with the entire workflow semi-autonomously completed by the Agent. Representative products of this paradigm include Devin, Cursor Agent Mode, and Claude Code. Google's moves here are specifically designed to ensure the Android ecosystem isn't marginalized by this wave.

Android CLI Stable Release: Giving AI Agents Access to IDE Capabilities

The Android CLI (command-line tool) has reached stable status, one of the most practically useful announcements at this year's I/O. Android Studio is built on IntelliJ IDEA, and its code analysis capabilities (semantic understanding, reference finding, type inference) have traditionally been locked inside the GUI, where external tools could not reach them programmatically. The new Android CLI exposes these capabilities at a level comparable to the Language Server Protocol (LSP), so an AI Agent can query a project's semantic information much as it would call an API.

The new version brings several key capabilities:

Programmatic Version Lookup: Developers can automatically query SDK, toolchain, and other version information through scripts, greatly simplifying version management in CI/CD pipelines. Version management has always been a headache in Android continuous integration—Android projects depend on numerous component versions, including Gradle version, Android Gradle Plugin version, Kotlin version, Compose Compiler version, and various AndroidX library versions, all with complex compatibility matrices. Programmatic version lookup allows CI scripts to automatically detect the current environment's SDK versions, confirm compatibility, and avoid build failures due to version mismatches—especially important in large teams and monorepo architectures.
Journeys Support: The CLI now natively supports the Journeys feature, making automated testing and development workflows smoother. Journeys is a natural language-based end-to-end UI testing approach introduced in Android Studio—developers describe user action paths in natural language (e.g., "Open the app, tap the login button, enter username and password, verify navigation to the home page"), and the system automatically converts them into executable UI tests. Compared to traditional Espresso or UI Automator tests, Journeys significantly lowers the barrier to writing tests and offers greater robustness against UI changes. CLI support for Journeys means these tests can run in headless environments, perfectly fitting CI pipelines.
Deep Integration with Android Studio: This is the most noteworthy highlight—any AI Agent can access Android Studio's powerful capabilities through the Android CLI, including IDE-level functions like analyzing files, finding references, and locating declarations.
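
The version-mismatch problem described in the list above can be sketched as a small CI check. This is only an illustration: the article does not show the Android CLI's actual commands or output format, so the sketch assumes the detected versions have already been collected into a dictionary, and every version number below is a placeholder rather than a real compatibility matrix. The helper names `parse_version` and `find_mismatches` are hypothetical.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '8.7.3' into (8, 7, 3); a suffix such as '-rc1' is dropped.

    Short versions are padded with zeros so that '8.5' and '8.5.0' compare equal.
    """
    core = text.split("-", 1)[0]
    parts = [int(part) for part in core.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def find_mismatches(detected: Dict[str, str], minimums: Dict[str, str]) -> List[str]:
    """Return one message per component that is missing or older than its minimum."""
    problems = []
    for name, minimum in minimums.items():
        found = detected.get(name)
        if found is None:
            problems.append(f"{name}: not detected (need >= {minimum})")
        elif parse_version(found) < parse_version(minimum):
            problems.append(f"{name}: {found} is older than required {minimum}")
    return problems


# Placeholder values for illustration only; not a real compatibility matrix.
detected = {"gradle": "8.5", "agp": "8.3.0", "kotlin": "1.9.22"}
minimums = {
    "gradle": "8.6",
    "agp": "8.3.0",
    "kotlin": "1.9.20",
    "compose-compiler": "1.5.0",
}

for line in find_mismatches(detected, minimums):
    print(line)
```

A CI job could fail the build whenever the returned list is non-empty, which surfaces the mismatch before Gradle does.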

Figure: Android CLI integration with Android Studio

What does this mean? Simply put, AI Agents are no longer "blind" code generators—they can now understand project structure, trace code reference relationships, and analyze file dependencies like a real developer. This opening of IDE capabilities is a critical step by Google in the Agentic Development direction.

Additionally, Google Antigravity (Google's AI development platform) now officially supports Android development, integrating the Android CLI and Skills through the Android Resources Bundle. This gives developers a unified entry point for accessing Android AI development resources.

Android Skills Continues to Expand: Bridging the Knowledge Gap Between LLMs and Android Development

Android Skills are "specialized knowledge packages" custom-built by Google for LLMs, injecting specialized workflows and domain knowledge into large language models to help AI better understand and handle common yet complex Android development scenarios.

Figure: Android Skills growing

In this update, Android Skills coverage has expanded significantly with the following new areas:

Adaptive UI: Helps LLMs understand how to build UIs that adapt to different screen sizes and device form factors. With the diversification of Android device form factors—from phones, tablets, and foldables to car displays, TVs, and Wear OS watches—developers need to build interfaces that gracefully adapt to different screen sizes, orientations, and input methods. Google recommends using Window Size Classes to categorize screens into Compact, Medium, and Expanded tiers, combined with Jetpack Compose's adaptive layout components. The complexity here lies in simultaneously considering changes in layout, navigation patterns, and interaction paradigms—a high-difficulty scenario where LLMs easily make mistakes, making dedicated Skills support particularly necessary.
Glimmer for XR (Display Glasses Development): Development skills for XR devices, reflecting Google's investment in spatial computing. Jetpack Compose Glimmer is Google's UI toolkit for display glasses on the Android XR platform, a platform that also covers headsets such as the Project Moohan device developed with Samsung. XR development differs fundamentally from traditional mobile development: depending on the device, it can involve 3D spatial layouts, gesture tracking, eye-tracking interaction, spatial audio, and other dimensions that phone apps never deal with. Glimmer builds on Jetpack Compose, so Android developers can carry existing skills into the XR space. The addition of this Skill indicates that Google is preparing its developer tooling for the coming spatial computing wave.
Perfetto SQL: Specialized skills in performance analysis, enabling AI to assist developers with deep performance tuning. Perfetto is Google's open-source system-level tracing tool that captures low-level system events such as CPU scheduling, memory allocation, GPU rendering, and Binder calls. Perfetto SQL is its query language, which lets developers run structured, SQL-based queries over large volumes of trace data, for example finding frame rendering events on the main thread that exceed 16 ms in order to identify the causes of jank. The domain is highly specialized, and general-purpose LLMs tend to have little reliable knowledge of it. With this Skill, AI can help write Perfetto SQL queries and interpret performance data, lowering the barrier to performance optimization.
App Functions: Skills for App Functions, the Android API through which an app exposes its own functionality so that assistants and Agents can invoke it.
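
The Window Size Class tiers mentioned under Adaptive UI can be illustrated with a short, language-neutral sketch. It uses the documented width breakpoints of 600 dp and 840 dp. The function names and the navigation mapping are illustrative assumptions that follow common Material guidance; a real app would use the Jetpack window size class APIs rather than this code.

```python
from enum import Enum


class WindowWidthSizeClass(Enum):
    COMPACT = "compact"
    MEDIUM = "medium"
    EXPANDED = "expanded"


def classify_width(width_dp: float) -> WindowWidthSizeClass:
    """Map a window width in dp to its size class (breakpoints: 600 dp and 840 dp)."""
    if width_dp < 0:
        raise ValueError("width_dp must be non-negative")
    if width_dp < 600:
        return WindowWidthSizeClass.COMPACT
    if width_dp < 840:
        return WindowWidthSizeClass.MEDIUM
    return WindowWidthSizeClass.EXPANDED


def navigation_for(size_class: WindowWidthSizeClass) -> str:
    """Pick a navigation pattern per size class (a common convention, not a rule)."""
    return {
        WindowWidthSizeClass.COMPACT: "bottom navigation bar",
        WindowWidthSizeClass.MEDIUM: "navigation rail",
        WindowWidthSizeClass.EXPANDED: "navigation rail or permanent drawer",
    }[size_class]


# Example widths: a typical phone, a foldable's inner display, a tablet in landscape.
for width in (411, 673, 1280):
    size_class = classify_width(width)
    print(width, size_class.value, navigation_for(size_class))
```

The point of the sketch is that layout and navigation both hang off the same classification, which is where generated code tends to drift when it hard-codes a phone layout.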

Figure: Using Skills through Android CLI

The value of these Skills lies in bridging the knowledge gap between general-purpose LLMs and specialized Android development. While general-purpose large models have powerful coding abilities, they often have blind spots regarding Android-specific API usage, best practices, and architectural patterns. Android Skills inject structured domain knowledge, enabling any LLM to become a more competent Android development assistant. Developers can directly invoke and experience these Skills through the Android CLI.
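
To make that domain knowledge concrete, the jank query mentioned under Perfetto SQL (main-thread frames longer than 16 ms) can be approximated as follows. This is a stand-in, not a real trace analysis: it runs the query against an in-memory SQLite table with made-up rows. The `name`, `ts`, and `dur` columns mirror Perfetto's `slice` table, where durations are in nanoseconds. The `is_main_thread` column on the same table is a simplification assumed for this sketch; in a real trace that information lives in the thread tables and would be reached through a join.

```python
import sqlite3

FRAME_BUDGET_NS = 16_000_000  # 16 ms expressed in nanoseconds

conn = sqlite3.connect(":memory:")
# Simplified stand-in for a trace: one flat table instead of slice + thread tables.
conn.execute(
    "CREATE TABLE slice (name TEXT, ts INTEGER, dur INTEGER, is_main_thread INTEGER)"
)
# Made-up sample rows for illustration only.
conn.executemany(
    "INSERT INTO slice VALUES (?, ?, ?, ?)",
    [
        ("Choreographer#doFrame", 0, 9_000_000, 1),
        ("Choreographer#doFrame", 16_600_000, 24_500_000, 1),
        ("Choreographer#doFrame", 50_000_000, 31_000_000, 1),
        ("DrawFrame", 52_000_000, 20_000_000, 0),
    ],
)

JANK_QUERY = """
SELECT name, ts, dur / 1e6 AS dur_ms
FROM slice
WHERE is_main_thread = 1 AND dur > ?
ORDER BY dur DESC
"""

slow_frames = conn.execute(JANK_QUERY, (FRAME_BUDGET_NS,)).fetchall()
for name, ts, dur_ms in slow_frames:
    print(f"{name} at ts={ts}: {dur_ms:.1f} ms")
```

Knowing which tables and units to use is exactly the kind of detail a general-purpose model is likely to get wrong without a dedicated Skill.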

Android Bench Adds New Model Evaluations: Establishing AI Capability Standards

Android Bench is Google's LLM evaluation leaderboard specifically designed to test various models against real Android development challenges, with the core goal of driving continuous improvement of models in Android development scenarios.

AI model evaluation benchmarks play a critical role in driving technological progress. Similar to how HumanEval evaluates general programming ability and SWE-bench evaluates real software engineering tasks, Android Bench focuses on Android-specific scenario evaluation. Its test cases come from real Android development challenges, including but not limited to: correctly using Android lifecycle APIs, handling configuration changes, implementing Material Design specifications, writing Compose UI, and handling permission requests. Vertical domain benchmarks like this expose the shortcomings of general models in specific domains and provide clear optimization directions for model training.
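
How a leaderboard of this kind might turn task outcomes into a ranking can be sketched as below. The article does not describe Android Bench's actual scoring method, so the simple pass-rate metric, the `rank_models` helper, the model names, and the task identifiers are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def rank_models(results: Iterable[Tuple[str, str, bool]]) -> List[Tuple[str, float]]:
    """Rank models by pass rate.

    results holds (model, task_id, passed) records. The return value is a list of
    (model, pass_rate) pairs, best first, with ties broken by model name.
    """
    passed: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for model, _task_id, ok in results:
        total[model] += 1
        passed[model] += int(ok)
    rates = [(model, passed[model] / total[model]) for model in total]
    return sorted(rates, key=lambda item: (-item[1], item[0]))


# Hypothetical outcomes; these are not real Android Bench results.
sample = [
    ("model-a", "lifecycle-config-change", True),
    ("model-a", "compose-ui-list", True),
    ("model-a", "runtime-permission", False),
    ("model-b", "lifecycle-config-change", False),
    ("model-b", "compose-ui-list", True),
    ("model-b", "runtime-permission", False),
]

for model, rate in rank_models(sample):
    print(f"{model}: {rate:.0%}")
```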

In this update, Android Bench responded to community requests with two important changes:

Addition of Open-Source Model Evaluations

The developer community has long requested that Google evaluate open-source model performance. This update officially includes more commonly used open-source models, including Google's own Gemma 4. Gemma is Google's open-source large language model series, built on the same research and technology as Gemini but released with smaller parameter sizes and open weights. Gemma 4, as the latest version, shows significant improvements in code generation and understanding. For enterprise developers, the value of open-source models lies in: local deployment for code privacy protection, fine-tuning on internal codebases, and independence from external API availability and pricing. This is an important reference for developers focused on the open-source ecosystem who want to deploy AI development assistants locally.

Inclusion of Latest Commercial Models

The leaderboard has been simultaneously updated with evaluation results for the latest commercial models, including Gemini 3.5 Flash and others. Developers can intuitively compare different models' actual performance on Android development tasks through the leaderboard, enabling more informed tool selection.

Figure: Android Bench adds Gemma 4 and other open-source models

The significance of Android Bench goes beyond providing a ranking—it establishes an AI capability standard for the Android development domain. As more models are included in the evaluation, model providers will have stronger motivation to optimize for Android scenarios, ultimately benefiting the entire developer community.

Summary: The Android Development Ecosystem in the Agentic Development Era

Looking at these three announcements together, Google's strategic intent is crystal clear: building an open, model-agnostic Android AI development ecosystem.

Android CLI provides the infrastructure layer, allowing any Agent to tap into Android Studio's capabilities
Android Skills provides the knowledge layer, enabling any LLM to possess Android expertise
Android Bench provides the evaluation layer, driving continuous evolution of the entire ecosystem

Google explicitly stated in the announcement: "In the Agentic Development era, we will continue to help you build Android apps using any AI tool, Agent, and LLM." This open stance means that whether you're using Gemini, Claude, GPT, or open-source models, Google is working to ensure a consistent, high-quality Android development experience.

This strategy stands in stark contrast to Apple—Apple tends to deeply bind AI capabilities within its own Xcode and Swift ecosystem, while Google has chosen a platform-neutral approach, opening Android development capabilities as infrastructure to all AI tools. This differentiated strategy may attract more third-party AI tool vendors to prioritize Android development scenarios, creating a positive feedback loop.

For Android developers, now is the best time to embrace AI-assisted development. We recommend that developers familiarize themselves with the Android CLI's new capabilities as soon as possible, try integrating it into existing development workflows, and follow the Android Bench leaderboard to select the AI model best suited to their scenarios.
