24 related articles
O3 vs Gemini 2.5 Pro vs Claude 3.7: Re…
Real-world comparison of O3, Gemini 2.5 Pro, and Claude 3.7 coding abilities through snake battles, RL training, solar system simulation, and soccer game tasks.
ResearchMeta reveals Muse Spark technical details: three-dimensional scaling across pre-training, RL, and test-time inference achieves over 10x compute reduction versus Llama 4 Maverick.
Tech FrontiersGLM5 code leak reveals 745B-parameter MoE architecture replicating DeepSeek V3. DeepSeek V4 may launch a 200B quantized model first, with flagship exceeding 1T parameters.
Tech FrontiersAnthropic adds custom sub-agents to Claude Code, Cursor launches code review Agent BugBot, Qwen releases 92-language translation model, and Google unveils three experimental AI products.