GPT 5.6 Internal Testing Codename Revealed, Google Pays SpaceX $920M Monthly for Computing Power

OpenAI Updates: GPT 5.6 Internal Testing Launched and Account Glitch Compensation

OpenAI has begun internal testing of a new GPT 5.6 checkpoint, codenamed "Kindle Alpha," and has discontinued two previous checkpoints. In large model development, a "checkpoint" is a complete snapshot of the model's parameters at a specific stage of training; research teams typically save multiple checkpoints so they can compare performance and roll back experiments. Internal codenames are standard practice at major AI labs: they simplify communication and make it harder for outsiders to infer technical details if something leaks. Discontinuing old checkpoints usually suggests that the new version has outperformed its predecessors in overall evaluations.
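To make the checkpoint idea concrete, here is a minimal sketch of saving, comparing, and rolling back parameter snapshots. The `CheckpointStore` class, the snapshot names, and the scores are hypothetical illustrations and say nothing about OpenAI's actual tooling; real training systems write snapshots to disk rather than keeping them in memory.

```python
import copy


class CheckpointStore:
    """Hypothetical in-memory store of named parameter snapshots."""

    def __init__(self):
        self._snapshots = {}

    def save(self, name, params, eval_score):
        # Deep-copy so later training steps cannot mutate the stored snapshot.
        self._snapshots[name] = (copy.deepcopy(params), eval_score)

    def best(self):
        # Name of the snapshot with the highest evaluation score.
        return max(self._snapshots, key=lambda n: self._snapshots[n][1])

    def restore(self, name):
        # Return a fresh copy so the stored snapshot stays untouched.
        return copy.deepcopy(self._snapshots[name][0])

    def discard(self, name):
        # Drop a snapshot that is no longer needed.
        del self._snapshots[name]


store = CheckpointStore()
params = {"w": [0.10, -0.20], "b": 0.0}
store.save("step-1000", params, eval_score=0.71)

params["w"][0] = 0.35  # training continues and changes the weights
store.save("step-2000", params, eval_score=0.78)

params["w"][0] = 9.99  # an experiment that goes wrong
params = store.restore(store.best())  # roll back to the best snapshot
```

After the rollback, `params` again holds the weights saved as "step-2000", and the older "step-1000" snapshot could be dropped with `store.discard("step-1000")` once it is no longer useful.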

According to the leaker, the test screenshots were produced without thinking mode enabled, which suggests that GPT 5.6 may have significantly improved base reasoning capabilities. Current mainstream reasoning models typically offer two modes: a base mode and a thinking mode (chain-of-thought), where the latter spends additional computation on explicit step-by-step reasoning to improve accuracy. If GPT 5.6 performs strongly without thinking mode, the gain would come from the foundation model itself rather than from extra computation at inference time, which would be a more fundamental technical advance.

On a related note, OpenAI recently banned some users' ChatGPT accounts by mistake because of a system glitch. The company has been gradually restoring account access and is compensating affected subscribers with a one-month subscription extension. The incident is also a reminder for users to pay attention to account security and platform stability.

Additionally, OpenAI's Codex tool received feature updates, including a new settings search function, along with improvements to full-screen conversations, message notifications, and other details to further enhance the developer experience.

Google Advancing on Multiple Fronts: Gemma 4 Quantization Optimization and SpaceX Computing Partnership

Gemma 4 Quantization-Aware Training Checkpoints Officially Released

Google has released Gemma 4 Quantization-Aware Training (QAT) checkpoints on Hugging Face. QAT simulates low-precision inference during the training phase. Traditional Post-Training Quantization (PTQ) compresses an already-trained model's floating-point parameters (typically 16- or 32-bit) into low-precision integers after the fact. QAT instead inserts "fake quantization" nodes during training so that the model learns to maintain performance in low-precision environments, which significantly reduces the accuracy loss.
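A "fake quantization" node can be sketched in a few lines: it rounds values onto a low-precision integer grid and immediately maps them back to floats, so the rest of the network sees the rounding error it will face after deployment. The function below is an illustrative assumption using symmetric per-tensor scaling at 4 bits; it is not Google's implementation, and the weights are made-up numbers. In actual QAT the rounding step is usually paired with a straight-through estimator so gradients can flow through it, which this sketch omits.

```python
def fake_quantize(values, num_bits=4):
    """Round values onto a signed integer grid, then map them back to floats."""
    qmax = 2 ** (num_bits - 1) - 1  # largest integer level, e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0  # float step between levels
    simulated = []
    for v in values:
        q = round(v / scale)          # nearest integer level
        q = max(-qmax, min(qmax, q))  # clamp to the representable range
        simulated.append(q * scale)   # back to float: the "fake" part
    return simulated


weights = [0.82, -0.41, 0.05, -0.97]  # hypothetical weights
simulated = fake_quantize(weights, num_bits=4)

# Worst-case rounding error the model would have to tolerate.
max_error = max(abs(w - s) for w, s in zip(weights, simulated))
```

Because the error is bounded by half a quantization step, training with these simulated values in the forward pass lets the model adapt its weights to the grid instead of being rounded only once training is over.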

All Gemma 4 model sizes can be optimized with QAT, and custom mobile quantization formats are supported. After optimization, the minimum memory footprint drops to 1GB, a notable milestone for edge deployment and for running large models on mobile devices. At that size, even mid-range smartphones can run a model locally, which opens the door to offline AI applications and privacy-sensitive scenarios.
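As a rough guide to where a figure like 1GB can come from, weight memory is approximately the parameter count times the bits per parameter, divided by 8 to get bytes. The sketch below uses a hypothetical 2-billion-parameter model purely for illustration; it is not a stated Gemma 4 size, and the estimate ignores activations, the KV cache, and any per-block quantization metadata.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


hypothetical_params = 2_000_000_000  # assumed size, for illustration only

for bits in (16, 8, 4):
    size = weight_memory_gb(hypothetical_params, bits)
    print(f"{bits:>2}-bit weights: {size:.1f} GB")
```

Under these assumptions the same model needs about 4 GB at 16 bits but only about 1 GB at 4 bits, which is why low-bit formats matter so much on phones.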

Gemini Omni Model Now Available

Google has opened access to the Gemini Omni model for Plus, Pro, and Ultra subscribers, available on both the website and mobile apps. The Omni model is natively multimodal: it is trained on text, images, audio, and video together from the start, rather than stitching together separate models for different modalities. Compared with earlier pipeline-based approaches, native multimodality captures cross-modal semantic relationships better, such as the correspondence between speech and visuals in a video. Google's decision to limit the model to paid tiers reflects both a commercialization strategy and the fact that multimodal inference costs significantly more compute than pure text interaction.

Google and SpaceX Reach $920M Monthly Computing Power Agreement

According to The Wall Street Journal, Google and SpaceX have reached a major cloud computing partnership. The agreement stipulates that from October 2026 to June 2029, Google will pay SpaceX $920 million per month for computing power.

The deeper context behind this partnership is the severe global AI computing supply-demand imbalance. NVIDIA GPU production capacity is currently constrained, and major cloud providers' GPU clusters are fully booked. While Google has its proprietary TPU chips, they remain insufficient to meet the explosive demand from Gemini model training and inference. Although SpaceX is known for its aerospace business, its Starlink operations have accumulated substantial data center infrastructure capable of providing computing services.

This partnership holds significant strategic value for both parties: Google can cover the shortfall in its TPU computing capacity, while SpaceX can use the deal to support its IPO process. At $920 million per month, the stated window of October 2026 through June 2029 covers 33 months, putting the total contract value at roughly $30 billion, close to the annual revenue of some mid-sized cloud computing companies. The figure reflects the enormous gap in current AI computing demand and underlines computing power's strategic position as "the new oil."
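The total follows directly from the monthly rate and the contract window reported above. The short calculation below is a sketch that assumes both endpoint months (October 2026 and June 2029) are billed in full; the helper name is hypothetical, and the actual billing terms have not been published.

```python
from datetime import date


def months_inclusive(start, end):
    """Number of calendar months from start to end, counting both endpoints."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


MONTHLY_PAYMENT_USD = 920_000_000  # rate reported for the agreement

months = months_inclusive(date(2026, 10, 1), date(2029, 6, 1))
total_usd = months * MONTHLY_PAYMENT_USD

print(f"{months} months, ${total_usd / 1e9:.2f} billion in total")
# prints "33 months, $30.36 billion in total"
```

If the final month were excluded, the window would shrink to 32 months and the total to about $29.4 billion, so the headline figure depends on how the endpoints are counted.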

Anthropic and Open-Source Ecosystem Developments

Claude Cowork Limited-Time Double Credits Promotion

Anthropic has launched a limited-time Claude Cowork promotion, doubling users' available credits through the following month. This initiative aims to help users tackle larger and more complex tasks, reflecting Anthropic's aggressive user growth strategy.

Riverflow 2.5 Image Generation Model Released

The image generation model Riverflow 2.5 has been officially released, with improvements across all text-to-image and image-to-image capabilities. The model has built-in reasoning for edits, reflecting a new trend in image generation: models not only generate images but also understand the semantic logic of editing instructions. For example, when a user asks it to "change the background to nighttime but keep the person's lighting natural," the model has to reason about which pixels to modify, which to preserve, and how to reconcile the lighting. This is far more complex than simple image inpainting and in effect requires common-sense reasoning about the physical world.

The model supports up to 4K image export, and the Pro version is available for free on OpenRouter, providing creators with another high-quality option.

NVIDIA Nemotron 3 Ultra Now Free to Use

OpenCode announced that NVIDIA's open-source flagship model Nemotron 3 Ultra is now available for free on its platform, with support for long-context processing. The end date of the free offer has not been officially specified, so interested developers are advised to try it soon.

Summary and Outlook

Looking at today's AI industry developments, competition among leading companies is unfolding simultaneously across multiple dimensions: OpenAI continues pushing GPT series iterations, Google consolidates its ecosystem through computing partnerships and model optimization, and Anthropic competes for users with promotional offers. The battle for computing resources has become one of the core battlegrounds in the AI race, as the scale of the Google-SpaceX partnership clearly demonstrates.