June 2 AI Industry Roundup: NVIDIA's 550B Model, xAI's Coding Model, and More Major Releases

On June 2, 2025, the AI industry saw major releases including NVIDIA's 550B-parameter Nemotron 3 Ultra, xAI's Composer 2.5 coding model, and Alibaba's Qwen 3.7-Plus multimodal agent foundation. Anthropic and ZhiPu both advanced IPO plans. OpenAI demoed a mobile agentic OS prototype, while agent infrastructure matured with Perplexity's Search as Code and Tencent's HiMemory plugin. Hardware expansion continued with NVIDIA's RTX Spark and OpenAI's Stargate data center project.
The day brought a flurry of product launches and corporate developments: NVIDIA released a 550B-parameter large model, multiple companies advanced their IPO plans, and the AI agent ecosystem continued to evolve. This article covers the most noteworthy technical breakthroughs and industry trends of the day.



Capital Markets: Two Major AI Companies Advance IPOs Simultaneously
On the capital front, two blockbuster announcements grabbed attention. Anthropic has confidentially filed a draft S-1 with the U.S. Securities and Exchange Commission (SEC), signaling plans to go public. However, the offering size and pricing remain undetermined, with timing dependent on the SEC review process and market conditions. As the developer of the Claude model series, Anthropic's IPO journey will serve as a key valuation benchmark for the AI industry.
Meanwhile, Chinese AI unicorn ZhiPu (智谱) is planning to file for an A-share IPO on the STAR Market. The offering would represent 2% to 8% of total shares, with proceeds earmarked for general-purpose foundation model development, MaaS platform construction, and working capital. The simultaneous IPO pushes by major AI companies in both the U.S. and China reflect the industry's shift from a technology race toward a phase defined by capitalization.
Major Model Releases: Four Launches with Distinct Focus Areas
NVIDIA Nemotron 3 Ultra: A 550B-Parameter Performance Benchmark
NVIDIA officially released Nemotron 3 Ultra, a 550-billion-parameter model that ranks among the largest publicly released models to date. The model emphasizes high output speed, with optimizations targeting coding and complex task scenarios. Given NVIDIA's expertise on the hardware side, the model is positioned to compete on inference efficiency.
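To put the 550-billion-parameter figure in perspective, the sketch below estimates how much storage the weights alone would need at a few common numeric precisions. The precisions are assumptions chosen for illustration; the announcement as reported here does not state how the model is stored or served, and the estimate ignores activations, KV cache, and any architectural sparsity.

```python
PARAMS = 550e9  # parameter count stated in the article

# Bytes per parameter for common storage precisions.
# These precisions are illustrative assumptions, not details from the release.
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "fp8": 1.0, "4-bit": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


footprints = {name: weight_gb(PARAMS, b) for name, b in BYTES_PER_PARAM.items()}

for name, gb in footprints.items():
    print(f"{name}: {gb:,.0f} GB")
```

Under these assumptions the weights occupy roughly 1,100 GB at 16-bit precision, 550 GB at 8-bit, and 275 GB at 4-bit, which is why models of this size are typically served across multiple accelerators.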
xAI Composer 2.5: Built for Complex Coding Tasks
xAI released its coding model Composer 2.5, integrated into the Grok Build model ecosystem. Designed for long-running tasks and complex instruction scenarios, it's now available to SuperGrok and X Premium+ users. This marks xAI's deeper push into the coding assistant space, putting it in direct competition with mainstream products like GitHub Copilot and Cursor.
Alibaba Qwen 3.7-Plus: A Versatile Multimodal Agent Foundation
Alibaba's Qwen team released Qwen 3.7-Plus, featuring comprehensive multimodal upgrades. The new model is available via the Bailian API, covering image understanding, GUI manipulation, and task execution capabilities. Positioned as a versatile multimodal agent foundation, Qwen is no longer just a language model — it can serve as the core engine driving complex interactive tasks for AI agents.
JetBrains Mellum 2: The IDE Giant's AI Core
JetBrains released Mellum 2, a 12B Mixture-of-Experts (MoE) model purpose-built for software development scenarios. It supports code generation, code comprehension, and assisted development workflows. As a homegrown model from the leader in IDE tooling, Mellum 2's core advantage lies in its deep integration with the JetBrains toolchain.
Agent Ecosystem: From Proof of Concept to Production
Multiple developments on this day indicate that AI agents are rapidly moving from proof-of-concept to real-world applications.
OpenAI showcased an agentic operating system prototype for mobile phones. Rather than relying on traditional apps, the prototype generates interfaces in real time from voice input and executes the resulting workflows. This direction could fundamentally redefine human-computer interaction on mobile devices.
Meituan's AI Agent "Xiao Mei" will deeply integrate with Tencent's Yuanbao platform, allowing users to submit food delivery, logistics, and other local lifestyle requests directly within Tencent Yuanbao. This is a landmark case of combining large models with O2O platforms in China, demonstrating the commercial monetization potential of agents in vertical scenarios.
Perplexity launched Search as Code, reimagining the search experience for AI agents — agents can write Python to invoke searches via the Agent API. Search is evolving from "humans actively seeking information" to "agents autonomously acquiring information," which is critical for agents' autonomous decision-making capabilities.
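The general idea of an agent composing searches in code, rather than issuing one query at a time, can be sketched as follows. This is not Perplexity's Agent API: the `search` function, the `gather` helper, and the in-memory `CORPUS` are hypothetical stand-ins invented for illustration, so the example runs without any network access.

```python
# Hypothetical stand-in corpus; a real agent would call a search service instead.
CORPUS = [
    {"url": "https://example.com/a", "text": "NVIDIA announces new GPU at GTC Taipei"},
    {"url": "https://example.com/b", "text": "Agent memory plugins for long tasks"},
    {"url": "https://example.com/c", "text": "GTC Taipei keynote recap: agents and GPUs"},
]


def search(query: str) -> list[dict]:
    """Hypothetical search call: return documents containing every query term."""
    terms = query.lower().split()
    return [doc for doc in CORPUS if all(t in doc["text"].lower() for t in terms)]


def gather(queries: list[str]) -> list[str]:
    """Run several searches and merge the hits, dropping duplicate URLs."""
    seen: set[str] = set()
    urls: list[str] = []
    for q in queries:
        for doc in search(q):
            if doc["url"] not in seen:
                seen.add(doc["url"])
                urls.append(doc["url"])
    return urls


results = gather(["gtc taipei", "agent"])
print(results)
```

Because the search is an ordinary function call, the agent can loop over queries, filter, and deduplicate with normal control flow instead of reading one results page at a time.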
Tencent Meeting released the HiMemory plugin, a long-term collaborative agent memory plugin that can save, organize, and recall key information across multi-turn tasks. Memory has long been a core weakness of AI agents, and HiMemory provides essential infrastructure for long-cycle collaborative tasks.
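The save, organize, and recall cycle described above can be illustrated with a toy in-memory store. This is a minimal sketch, not HiMemory's actual interface: the `MemoryStore` class, its method names, and the keyword-matching recall are all assumptions made for the example, and a production system would likely use persistent storage and semantic retrieval.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Toy long-term memory: notes are grouped by topic and recalled by keyword."""

    notes: dict[str, list[str]] = field(default_factory=dict)

    def save(self, topic: str, note: str) -> None:
        # Organize by topic so related notes from different turns stay together.
        self.notes.setdefault(topic, []).append(note)

    def recall(self, keyword: str) -> list[str]:
        # Case-insensitive substring match over every saved note.
        needle = keyword.lower()
        return [
            note
            for topic_notes in self.notes.values()
            for note in topic_notes
            if needle in note.lower()
        ]


store = MemoryStore()
store.save("launch", "Turn 1: launch date moved to Friday")
store.save("budget", "Turn 2: budget capped at 10k")
store.save("launch", "Turn 5: launch owner is Dana")

launch_notes = store.recall("launch")
print(launch_notes)
```

Here the note saved in turn 1 is still retrievable in turn 5 and beyond, which is the property that long-running collaborative tasks depend on.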
Hardware & Infrastructure: Computing Power Expansion Continues to Accelerate
At GTC Taipei, NVIDIA announced two significant hardware developments:
- RTX Spark: A local AI agent device for Windows PCs, delivering up to PFLOPS-level AI compute and 128GB of unified memory, significantly lowering the hardware barrier for local agent deployment
- FIFOX Factory Operations Blueprint: Targeting the semiconductor and electronics manufacturing industries, it connects machine signals with quality data to drive large-scale AI adoption in industrial manufacturing
Additionally, OpenAI broke ground on the Stargate data center project in Michigan, which is planned at an unprecedented scale to meet the massive compute demands of future AI deployments. The pace of computing infrastructure expansion directly determines the upper limits of next-generation model training and inference capabilities.
Open Source & Ecosystem: Technology Democratization Continues
The open-source space was equally active:
- Apache RocketMQ launched RocketMQ for AI, an AI-specific messaging engine designed for long conversations, multi-agent collaboration, and resource scheduling — now fully open-sourced
- SenseTime released SenseNova U1, specializing in infographic and data chart generation, available for download on Hugging Face
- OpenBMB, in collaboration with Tsinghua NLP and ModelBest, released two open-source datasets covering pre-training and SFT stages, topping the Hugging Face trending charts
- Runway joined the Cosmos Coalition alongside NVIDIA, collaborating on open world models for robotics and autonomous driving scenarios
Safety & Compliance: Growing Governance Pressure
On a cautionary note, a security vulnerability was discovered in Meta AI that allows attackers to exploit the chatbot to bind new email addresses, potentially leading to Instagram account takeovers. Additionally, the state of Florida filed a lawsuit against OpenAI and Sam Altman, focusing on ChatGPT-related risks and corporate safety responsibilities. Security and compliance issues in AI products are becoming hard constraints that the industry can no longer avoid.
Summary & Trend Analysis
The AI industry developments on June 2 reveal several clear trends: model scale continues to grow but with greater emphasis on specialized capability optimization; agents are moving from demo stage to production environments; capital market enthusiasm for the AI sector remains strong; and safety and compliance concerns are pushing the industry toward more mature governance frameworks.
For practitioners, paying attention to the maturity of agent infrastructure — memory systems, search interfaces, messaging engines — may prove more practically valuable than simply chasing parameter counts.