1629 related articles
Tech Frontiers: Liquid AI releases LFM2.5-8B-A1B, a MoE model with 8B total params but only 1.5B active, matching 6B-class models in tool calling. Supports 128K context, local deployment, multilingual, with SGLang Day-0 support.
Industry Insights: SGLang co-hosts a finance AI inference event with Crusoe AI and Cloudflare, exploring LLM inference deployment in trading, risk management, and compliance, signaling Wall Street's shift to production-grade AI infrastructure.
Industry Insights: AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.
Tech Frontiers: Cloudflare contributes decode KV cache offload and Mooncake recovery fixes to SGLang, resolving garbled output under high concurrency for Kimi K2.6 and enabling automatic fault recovery in distributed inference.
Tech Frontiers: SGLang team hosts an Agent Loops Office Hour exploring inference optimization for agentic loops, covering KV Cache reuse, low-latency multi-turn dialogue, and tool calling techniques.
O3 vs Gemini 2.5 Pro vs Claude 3.7: Re…
Real-world comparison of O3, Gemini 2.5 Pro, and Claude 3.7 coding abilities through snake battles, RL training, solar system simulation, and soccer game tasks.
Deep Comparison of o1, o1 pro, and o3-…
Deep Research comparison of OpenAI o1, o1 pro, and o3-mini-high coding capabilities, covering code quality, optimization, error rates, and debugging with benchmarks and real-world cases.
Llama 3.3 70B In-Depth Review: Testing…
Meta releases Llama 3.3 70B, an open-source model whose 70B parameters rival 405B-class performance. Tested on 13 logic, math, and coding questions, it passed 12, reshaping the open-source model landscape.
Real-World Coding Test of 13 Top AI Mo…
Benchmark of 13 top AI models including GPT-4.1, Claude 3.7 Sonnet, and Gemini 2.5 Pro on coding ability, scored across 8 dimensions using the same high-difficulty algorithm problem.
API Aggregation Proxy Platforms Tested…
Hands-on testing of an API aggregation proxy platform's model calling capabilities, including GPT-Image2 image generation, cost analysis, and coverage of 100+ models like Claude and Gemini.
Orchestrating AI Agents as State Machi…
Explore the next evolution of AI coding: applying CI/CD engineering practices to orchestrate Agents as state machines with YAML templates, Gates, and Dashboards for autonomous multi-Agent progression.
Six Foundational Upgrades to Claude Co…
Anthropic's largest-ever foundational upgrade to Claude Code fixes six critical issues at once—terminal flickering, thinking freezes, cryptic errors, context deadlocks, unstable connections, and session crashes—shifting AI coding competition to the infrastructure layer.
BMad-Method: Building an AI Agile Deve…
Deep dive into BMad-Method, an open-source multi-agent framework simulating a full agile team—from business analysis to QA—supporting Claude Code, Cursor, and more.
Augment Remote Agent Hands-On: Running…
Hands-on review of Augment Remote Agent: 10 cloud AI Agents coding in parallel, covering bug fixes, PR generation, documentation, and more with detailed workflows and real-world examples.
Claude Code Source Code Study Guide: E…
Learn AI Agent development from Claude Code's 510K lines of source code, covering Agent Loop, context compression, multi-Agent orchestration, and two efficient study methods.
Claude Code Monitor Tool Explained: Ev…
Deep dive into Claude Code's new built-in Monitor tool. Learn how event-driven monitoring replaces polling through its Stream Filter and Poll-and-Diff modes, dramatically reducing token consumption.
Low-Cost Solution for Using GPT Models…
How to use the CLIProxyAPI local proxy to combine Claude Code's programming UX with GPT Codex Team models for under $1.50/month with ample quota and full privacy.
AI Tool Rankings for Solo Businesses: …
A complete AI tool matrix for solo businesses across 7 categories—Text, Image, Video, Audio, Digital Avatars, Coding & Agents—with top picks, alternatives, and open-source options.
Major Claude Code Update: A Complete G…
Deep dive into Claude Code's new Agent View and Goal system, covering multi-agent parallel management, background sessions, and result-oriented autonomous execution.
Spring AI Agent Utils: A Java Agent To…
Deep dive into the Spring AI Agent Utils toolkit, covering Skill modules, AskUserQuestion, TodoWrite, Auto Memory, and multi-Agent orchestration, empowering Java developers to build powerful AI Agents.