803 related articles
Tech Frontiers: Deep dive into StepFun AI's Step 3.7 Flash, a 198B sparse MoE vision-language model with 256K context and 3-level reasoning, excelling in multimodal understanding, AI coding, and Agent tool orchestration.
Tech Frontiers: Liquid AI releases LFM2.5-8B-A1B, a MoE model with 8B total params but only 1.5B active, matching 6B-class models in tool calling. Supports 128K context, local deployment, and multilingual use, with SGLang Day-0 support.
Industry Insights: SGLang co-hosts a finance AI inference event with Crusoe AI and Cloudflare, exploring LLM inference deployment in trading, risk management, and compliance, signaling Wall Street's shift to production-grade AI infrastructure.
Industry Insights: AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.
Tech Frontiers: SGLang team hosts an Agent Loops Office Hour exploring inference optimization for agentic loops, covering KV Cache reuse, low-latency multi-turn dialogue, and tool calling techniques.
O3 vs Gemini 2.5 Pro vs Claude 3.7: Re…
Real-world comparison of the coding abilities of o3, Gemini 2.5 Pro, and Claude 3.7 through snake battles, RL training, solar system simulation, and soccer game tasks.
Deep Comparison of o1, o1 pro, and o3-…
Deep Research comparison of OpenAI o1, o1 pro, and o3-mini-high coding capabilities, covering code quality, optimization, error rates, and debugging with benchmarks and real-world cases.
Real-World Coding Test of 13 Top AI Mo…
Benchmark of 13 top AI models including GPT-4.1, Claude 3.7 Sonnet, and Gemini 2.5 Pro on coding ability, scored across 8 dimensions using the same high-difficulty algorithm problem.
Six Foundational Upgrades to Claude Co…
Anthropic's largest-ever foundational upgrade to Claude Code fixes six critical issues at once—terminal flickering, thinking freezes, cryptic errors, context deadlocks, unstable connections, and session crashes—shifting AI coding competition to the infrastructure layer.
Tech Frontiers: OpenAI launches Rosalind Biodefense, offering GPT-Rosalind to government agencies to accelerate pathogen surveillance, vaccine R&D, and pandemic preparedness using AI.
Claude Code with MiniMax M2: Testing a…
Real-world testing of MiniMax M2 as Claude Code's backend model across three projects: framework migration, iOS development, and full-stack MVP — at just 8% of Claude's price.
AI Fully Automated Orchestration in Pr…
Deep analysis of AI fully automated software orchestration: from Claude Code workflows to parallel orchestration strategies, exploring how models like MiniMax M1 drive software production costs toward zero.
AI Programming Spec Sheets: 30 Lines o…
Replace vague prompts with spec sheets—30 lines of config gets AI coding right the first time. Covers the six-element framework, three-tier boundaries, and three iron rules to eliminate rework.
Codex Security Guide: Five Key Princip…
A detailed guide to OpenAI Codex permission management covering workspace setup, three permission modes, approval mechanisms, risk-level management, and five safety mantras for secure AI coding.
Getting Started with Claude Code: 5 Co…
Deep dive into the core differences between Claude Code and regular AI chat tools across 5 dimensions: interaction, context understanding, execution, memory, and tool invocation.
Deep Dive into Three Major LLM Career …
Deep analysis of three core LLM roles—Application Engineer, Development Engineer, and Algorithm Engineer—covering technical requirements, salary thresholds, and career prospects including RAG, fine-tuning, and inference deployment.
MCP Protocol Practical Guide: The Stan…
Deep dive into MCP (Model Context Protocol) principles and practical applications. Learn how LLMs connect to external tools via MCP to become agents, covering Java tech stacks, MCP Server ecosystem, Cherry Studio demos, and A2A protocol comparison.
AI Agent Practical Development: A Comp…
A deep dive into AI Agent core principles and practical development paths, covering perception-decision-execution capabilities, MCP protocol tool integration, and analysis of Manus and AutoGLM.
Gemini 2.5 Pro 0605 Hands-On Compariso…
Hands-on testing of Gemini 2.5 Pro 0605 across coding, reasoning, creative writing, and app development, compared head-to-head with OpenAI o3 and Claude Opus 4.
AI Coding Real-World Test: GPT-5, Gemi…
Real-world test using Cursor IDE: GPT-5, Gemini 2.5 Pro, Kimi K2, and Grok 4 all fail at static web scraping while Claude leads with 126 pages. Deep analysis of why top AI models struggle.