Qwen Core Team Turmoil, OpenAI and Google Release New Models in Rapid Succession | AI Daily

March 5 saw intense AI industry activity: multiple core leaders left Alibaba's Qwen team, reportedly due to conflicts over shifting performance metrics from technical indicators to DAU/MAU, exposing deep tensions between fundamental research and commercialization. Meanwhile, MiniMax released Music 2.5+ with new instrumental creation capabilities, OpenAI launched GPT-5.3 Instant with improved conversational naturalness, Google released the cost-effective Gemini 3.1 Flash-Lite, and Volcano Engine announced Seedance 2.0 video generation API pricing. The industry shows three clear trends: intensifying talent mobility, accelerating model iteration, and broadening application scenarios.
March 5th was an eventful day in the AI industry: MiniMax released a new version of its music model, multiple core leaders of Alibaba's Qwen team departed in succession, drawing widespread attention, and both OpenAI and Google launched new models. This article covers the day's key developments and provides in-depth analysis of industry changes.
Qwen Core Team Shakeup: A Dispute Over Performance Metrics?
Lin Junyang, the technical lead of Alibaba's Qwen large language model, announced his departure. Several other core leaders of the Qwen LLM have also left in recent weeks, triggering widespread concern across the industry.

According to insiders, these departures may not have been entirely voluntary. Sources indicate that upper management changed the performance metrics for the foundational R&D team to Daily Active Users (DAU) and Monthly Active Users (MAU) — a classic consumer product evaluation approach that is clearly inappropriate for measuring a foundational model research team. The value of fundamental research typically manifests through technical breakthroughs and long-term accumulation, not short-term user growth metrics.
This controversy reflects a common tension during the AI industry's transition from "technology-driven" to "product-driven" development. In academia and early AI labs, evaluation systems for fundamental research typically revolve around technical indicators such as paper publications, benchmark rankings, and model capability breakthroughs. Internet companies, however, naturally lean toward quantifiable growth metrics. As a tech giant rooted in e-commerce and cloud computing, it's not unusual for Alibaba to transplant consumer product thinking onto its foundational model team, but the side effects are equally significant — foundational model R&D cycles are often measured in years, and short-term user data cannot reflect the true value of technical accumulation. This mismatch easily leads to confused team objectives and ultimately drives core talent away.
Other sources said that those who left were forced to resign due to internal issues. Lin Junyang posted on social media that he "needs to rest" and noted that "former colleagues can carry on, no problem."

This incident reflects a deep-seated contradiction in today's AI industry: the tension between fundamental research and commercialization. When management is eager to see commercial returns and evaluates R&D teams using consumer-grade metrics, it can lead to the loss of core talent. As a notable detail, Zhipu AI posted job openings on the same day, listing multiple positions from pre-training to post-training based in Beijing or Shanghai — seemingly seizing the opportunity to attract talent.
MiniMax Music 2.5+: AI Music Creation Capabilities Upgraded
MiniMax released MiniMax Music 2.5+, bringing several important upgrades:
- New instrumental music creation capability: No longer limited to song generation, it can now create purely instrumental works without vocals
- Multi-style support: Covers classical, minimalist, electronic, and many other music styles
- Cross-style fusion: Supports creative blending across different styles
- Audio quality improvement: Overall sound quality has noticeably improved
The new version is now available on the MiniMax Audio platform, with the API simultaneously open for access.
AI music generation has evolved through three generations: from rule-based synthesis, to statistical models, to deep learning. Early systems relied on hard-coded music theory rules with extremely limited flexibility. Around 2016, LSTM-based models (such as Google's Magenta project) began generating melodies with some degree of coherence. In the 2020s, the introduction of Diffusion Models and Transformer architectures produced a qualitative leap in audio quality, with products like Suno and Udio emerging in succession. The new instrumental generation capability in MiniMax Music 2.5+ technically requires the model to manage complex multi-voice arrangements guided solely by style descriptions and mood instructions, without lyrical semantic anchors — demanding a higher level of musical structure comprehension.
The addition of instrumental music creation is a noteworthy direction. It means AI music tools are evolving from "assisting song creation" to "full-category music production," formally entering professional tracks such as film scoring and game soundtracks. The addressable market extends from consumer entertainment to business content production, with applications in game sound effects and background music poised to expand further.
OpenAI Releases GPT-5.3 Instant: More Natural Conversations
OpenAI launched the GPT-5.3 Instant model, with key improvements over GPT-5.2 including:
- Smoother conversations: Reduces the "preachy conversational style" that users criticized in previous versions
- Optimized interaction experience: Overall conversation feels more natural
Additionally, OpenAI stated that GPT-5.4 is also on the way, arriving "sooner than expected."
The rapid version cadence from GPT-5.2 to 5.3 and the upcoming 5.4 signals an industry shift from the "capability arms race" to "experience refinement." Early LLM competition focused on parameter scale and benchmark scores, but as mainstream model capabilities converge, users have become significantly more sensitive to experience details like "whether conversations feel natural" and "whether there's a lecturing tone." The "preachy conversational style" that GPT-5.3 focuses on fixing is essentially a side effect of over-alignment during RLHF (Reinforcement Learning from Human Feedback) training — where models produce redundant disclaimers and didactic tones to satisfy safety scores. Rapidly iterating minor versions to fix experience issues while maintaining market momentum and accumulating real user feedback for the next training round has become a standard operational strategy for leading companies.
The faster cadence also reflects how intense competition in today's LLM landscape has become.
Google Gemini 3.1 Flash-Lite: A Cost-Effective Lightweight Model
Google released its lightweight model Gemini 3.1 Flash-Lite, positioned as a high-value small model:

- Pricing: $0.25 per million input tokens, $1.50 per million output tokens
- Performance: Slightly outperforms Gemini 2.5 Flash in benchmarks at a lower price
- Speed: Significantly faster output speed
Gemini 3.1 Flash-Lite's pricing strategy reflects the "high volume, thin margin" platform logic in LLM commercialization. Tokens are the basic billing unit for large models; as a rule of thumb, 1 million tokens is roughly equivalent to 750,000 English words or about 1.5 million Chinese characters, though the exact ratio depends on the tokenizer. For high-frequency API call scenarios (such as customer service bots, content moderation, and code completion), token cost directly determines a product's commercial viability. Lightweight models (Flash/Lite series) are typically derived from flagship models through knowledge distillation, quantization, and other compression techniques, achieving 3-10x inference speed improvements and 60-80% cost reductions while sacrificing some complex reasoning ability.
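To make the per-token arithmetic concrete, here is a minimal Python sketch that estimates a monthly bill from the prices quoted above. The workload figures (requests per day, tokens per request) are hypothetical and chosen only for illustration, and the names `TokenPricing` and `monthly_cost_usd` are invented for this example rather than taken from any Google SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Per-million-token list prices in USD."""
    input_usd_per_million: float
    output_usd_per_million: float

    def cost_usd(self, input_tokens: int, output_tokens: int) -> float:
        """Return the bill in USD for the given token counts."""
        total = (input_tokens * self.input_usd_per_million
                 + output_tokens * self.output_usd_per_million)
        return total / 1_000_000


# Prices quoted in this article for Google's lightweight model.
QUOTED_PRICING = TokenPricing(input_usd_per_million=0.25,
                              output_usd_per_million=1.50)


def monthly_cost_usd(pricing: TokenPricing,
                     requests_per_day: int,
                     input_tokens_per_request: int,
                     output_tokens_per_request: int,
                     days: int = 30) -> float:
    """Estimate a monthly bill; every workload number is a caller-supplied assumption."""
    total_requests = requests_per_day * days
    return pricing.cost_usd(total_requests * input_tokens_per_request,
                            total_requests * output_tokens_per_request)


if __name__ == "__main__":
    # Hypothetical customer service bot: 100,000 requests per day,
    # 800 input tokens and 200 output tokens per request.
    estimate = monthly_cost_usd(QUOTED_PRICING, 100_000, 800, 200)
    print(f"Estimated monthly cost: ${estimate:,.2f}")
```

Under these assumed numbers the output side dominates the bill even though it accounts for only a fifth of the tokens, because output tokens are priced six times higher than input tokens.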
Within Google's model product line, the Flash-Lite series has always played a "volume" role. Lower prices combined with better performance make it strongly competitive in API call scenarios. Its core advantage lies in the combination of "good enough" quality and extremely low cost, making it especially suitable for cost-sensitive small and medium developers and for high-concurrency applications that still have quality requirements.
Volcano Engine Seedance 2.0 API Pricing Announced
Volcano Engine announced API pricing for its video generation model Seedance 2.0:
- Text-to-video generation: ¥28 per million tokens
- Video input included: ¥46 per million tokens

However, API access is not yet available — only the pricing plan has been announced.
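As a rough illustration of what these prices imply, the sketch below converts a token count into a cost in yuan. The article does not say how many tokens a generated clip consumes, so `ASSUMED_TOKENS_PER_CLIP` is a purely hypothetical placeholder, and `video_cost_cny` is an illustrative helper, not part of any Volcano Engine API.

```python
# Prices quoted in this article, in CNY per million tokens.
SEEDANCE_CNY_PER_MILLION = {
    "text_to_video": 28.0,
    "with_video_input": 46.0,
}


def video_cost_cny(tokens: int, mode: str = "text_to_video") -> float:
    """Return the cost in CNY for a given token count and pricing mode."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens * SEEDANCE_CNY_PER_MILLION[mode] / 1_000_000


if __name__ == "__main__":
    # Hypothetical figure: the real per-clip token count is not given
    # in the article and will depend on resolution and duration.
    ASSUMED_TOKENS_PER_CLIP = 100_000
    for mode in SEEDANCE_CNY_PER_MILLION:
        cost = video_cost_cny(ASSUMED_TOKENS_PER_CLIP, mode)
        print(f"{mode}: ¥{cost:.2f} per clip")
```

Once the API opens and actual token usage per clip is known, replacing the assumed figure gives a per-video cost that can be compared directly across the two pricing modes.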
Seedance 2.0's per-token pricing is far higher than that of text models, a gap rooted in the much greater computational cost of video generation. Generating a 5-second 720p video requires the model to maintain inter-frame consistency along the temporal dimension while simultaneously handling motion trajectories, lighting changes, physical plausibility, and other constraints, roughly 100-500x the computational load of equivalent-quality image generation. Current mainstream video generation models (including Sora, Runway Gen-3, and Kling) generally employ spatiotemporal attention mechanisms based on diffusion models, which require repeated computation over high-dimensional tensors across dozens of denoising steps per inference and consume GPU memory and compute intensively. This also explains why video generation APIs commonly adopt an "announce pricing first, delay access" strategy: stress testing infrastructure for large-scale concurrent calls requires thorough preparation.
Summary
The AI industry developments on March 5th reveal several clear trends: First, talent mobility is intensifying — the Qwen team changes and Zhipu AI's hiring form a stark contrast. Second, model iteration is accelerating — both OpenAI and Google are rapidly releasing new versions. Third, application scenarios are broadening — from music to video, AI's creative capabilities continue to expand. For practitioners, finding the balance between commercialization pressure and technical accumulation remains a question that demands ongoing reflection.