#Dia

1629 related articles

May 30, 2026 · 2 min

LFM2.5-8B-A1B: A MoE Model with 1.5B Active Parameters Delivering 4x Its Weight Class Performance

Liquid AI releases LFM2.5-8B-A1B, a MoE model with 8B total params but only 1.5B active, matching 6B-class models in tool calling. Supports 128K context, local deployment, multilingual, with SGLang Day-0 support.

Industry Insights

May 30, 2026 · 2 min

SGLang Enters Finance: How AI Inference Infrastructure Is Reshaping Wall Street

SGLang co-hosts a finance AI inference event with Crusoe AI and Cloudflare, exploring LLM inference deployment in trading, risk management, and compliance — signaling Wall Street's shift to production-grade AI infrastructure.

Industry Insights

May 30, 2026 · 2 min

AMD MI355X Beats B200: Full-Stack Optimization Breakdown for 5% Lower TCO on DeepSeek-R1 Inference

AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.

Tech Frontiers

May 30, 2026 · 1 min

Cloudflare Contributes Critical KV Cache and Mooncake Fixes to SGLang

Cloudflare contributes decode KV cache offload and Mooncake recovery fixes to SGLang, resolving garbled output under high concurrency for Kimi K2.6 and enabling automatic fault recovery in distributed inference.

Tech Frontiers

May 30, 2026 · 1 min

SGLang Hosts Agent Loops Office Hour, Focusing on Agentic Loop Architecture Optimization

SGLang team hosts an Agent Loops Office Hour exploring inference optimization for agentic loops, covering KV Cache reuse, low-latency multi-turn dialogue, and tool calling techniques.

Product Reviews

May 30, 2026 · 3 min

O3 vs Gemini 2.5 Pro vs Claude 3.7: Real-World AI Coding Ability Comparison

Real-world comparison of O3, Gemini 2.5 Pro, and Claude 3.7 coding abilities through snake battles, RL training, solar system simulation, and soccer game tasks.

Product Reviews

May 30, 2026 · 3 min

Deep Comparison of o1, o1 pro, and o3-mini-high Coding Capabilities: A Deep Research Analysis

Deep Research comparison of OpenAI o1, o1 pro, and o3-mini-high coding capabilities, covering code quality, optimization, error rates, and debugging with benchmarks and real-world cases.

Product Reviews

May 30, 2026 · 3 min

Llama 3.3 70B In-Depth Review: Testing the Strongest Open-Source LLM with 13 Questions

Meta releases Llama 3.3 70B open-source model with just 70B parameters rivaling 405B performance. Tested on 13 logic, math, and coding questions, it passed 12 — reshaping the open-source model landscape.

Product Reviews

May 30, 2026 · 3 min

Real-World Coding Test of 13 Top AI Models: Who Is the Best Programming Assistant?

Benchmark of 13 top AI models including GPT-4.1, Claude 3.7 Sonnet, and Gemini 2.5 Pro on coding ability, scored across 8 dimensions using the same high-difficulty algorithm problem.

Product Reviews

May 30, 2026 · 2 min

API Aggregation Proxy Platforms Tested: One Interface to Call 100+ AI Models

Hands-on testing of an API aggregation proxy platform's model calling capabilities, including GPT-Image2 image generation, cost analysis, and coverage of 100+ models like Claude and Gemini.

Tutorials

May 30, 2026 · 2 min

Orchestrating AI Agents as State Machines: Stop Being a Human Confirmation Button

Explore the next evolution of AI coding: applying CI/CD engineering practices to orchestrate Agents as state machines with YAML templates, Gates, and Dashboards for autonomous multi-Agent progression.

Industry Insights

May 30, 2026 · 3 min

Six Foundational Upgrades to Claude Code: AI Programming Moves from Lab to Industrial Scale

Anthropic's largest-ever foundational upgrade to Claude Code fixes six critical issues at once—terminal flickering, thinking freezes, cryptic errors, context deadlocks, unstable connections, and session crashes—shifting AI coding competition to the infrastructure layer.

Tutorials

May 30, 2026 · 3 min

BMad-Method: Building an AI Agile Development Team with a Multi-Agent Framework

Deep dive into BMad-Method, an open-source multi-agent framework simulating a full agile team—from business analysis to QA—supporting Claude Code, Cursor, and more.

Product Reviews

May 30, 2026 · 3 min

Augment Remote Agent Hands-On: Running 10 Cloud AI Agents in Parallel for Programming

Hands-on review of Augment Remote Agent: 10 cloud AI Agents coding in parallel, covering bug fixes, PR generation, documentation, and more with detailed workflows and real-world examples.

Tutorials

May 30, 2026 · 3 min

Claude Code Source Code Study Guide: Efficiently Mastering Core AI Agent Development Architecture

Learn AI Agent development from Claude Code's 510K lines of source code, covering Agent Loop, context compression, multi-Agent orchestration, and two efficient study methods.

Tutorials

May 30, 2026 · 2 min

Claude Code Monitor Tool Explained: Event-Driven Replaces Polling, Saving Tokens More Efficiently

Deep dive into Claude Code's new built-in Monitor tool. Learn how event-driven monitoring replaces polling through its Stream Filter and Poll-and-Diff modes, dramatically reducing token consumption.

Tutorials

May 30, 2026 · 3 min

Low-Cost Solution for Using GPT Models with Claude Code: Build an AI Programming Workflow for ~$1.50/Month

How to use ClipRoxyAPI local proxy to combine Claude Code's programming UX with GPT Codex Team models for under $1.50/month with ample quota and full privacy.

Product Reviews

May 30, 2026 · 2 min

AI Tool Rankings for Solo Businesses: Top Picks, Alternatives & Open-Source Options Across 7 Categories

A complete AI tool matrix for solo businesses across 7 categories—Text, Image, Video, Audio, Digital Avatars, Coding & Agents—with top picks, alternatives, and open-source options.

Product Reviews

May 30, 2026 · 2 min

Major Claude Code Update: A Complete Guide to Agent View and the Goal System

Deep dive into Claude Code's new Agent View and Goal system, covering multi-agent parallel management, background sessions, and result-oriented autonomous execution.

Tutorials

May 30, 2026 · 3 min

Spring AI Agent Utils: A Java Agent Toolkit Reverse-Engineered from Claude Code's Core Features

Deep dive into the Spring AI Agent Utils toolkit, covering Skill modules, AskUserQuestion, TodoWrite, Auto Memory, and multi-Agent orchestration, which help Java developers build capable AI Agents.