#MoE

90 related articles

May 30, 2026 · 2 min

LFM2.5-8B-A1B: A MoE Model with 1.5B Active Parameters Delivering 4x Its Weight Class Performance

Liquid AI releases LFM2.5-8B-A1B, a MoE model with 8B total params but only 1.5B active, matching 6B-class models in tool calling. Supports 128K context, local deployment, multilingual, with SGLang Day-0 support.

Industry Insights

May 30, 2026 · 2 min

AMD MI355X Beats B200: Full-Stack Optimization Breakdown for 5% Lower TCO on DeepSeek-R1 Inference

AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.

Tech Frontiers

May 30, 2026 · 1 min

Cloudflare Contributes Critical KV Cache and Mooncake Fixes to SGLang

Cloudflare contributes decode KV cache offload and Mooncake recovery fixes to SGLang, resolving garbled output under high concurrency for Kimi K2.6 and enabling automatic fault recovery in distributed inference.

Industry Insights

May 29, 2026 · 3 min

AI Fully Automated Orchestration in Practice: How Software Production Costs Are Being Completely Disrupted

Deep analysis of AI fully automated software orchestration: from Claude Code workflows to parallel orchestration strategies, exploring how models like MiniMax M1 drive software production costs toward zero.

Industry Insights

May 29, 2026 · 3 min

Deep Dive into Three Major LLM Career Paths: Requirements, Tech Stacks, and Career Prospects

Deep analysis of three core LLM roles—Application Engineer, Development Engineer, and Algorithm Engineer—covering technical requirements, salary thresholds, and career prospects including RAG, fine-tuning, and inference deployment.

Tutorials

May 29, 2026 · 2 min

DeepSeek V3 + bolt.html: A Practical Guide to Generating Beautiful Web Pages with Zero Code

Learn how DeepSeek V3-0324 and open-source tool bolt.html combine to generate beautiful HTML pages with zero code using prompt engineering techniques.

Tutorials

May 28, 2026 · 2 min

Why Qwen3 Is the Best Open-Source Model for MCP Agent Development

Analysis of Qwen3's advantages for MCP agent development, contrasting it with DeepSeek R1's lack of function calling and covering its MoE architecture and thinking-mode switching.

Tech Frontiers

May 28, 2026 · 3 min

June AI Showdown: Mythos, Sonnet 4.8, and GPT-5.6 All Revealed

June 2026 becomes AI's densest release month: Anthropic Mythos nears launch, Claude Sonnet/Opus 4.8 arrive as skip-level upgrades, GPT-5.6 iterates rapidly, and DeepSeek V4 Pro gets a permanent 75% price cut.

Tutorials

May 28, 2026 · 2 min

Complete Guide to Connecting Codex with DeepSeek: Stable Setup in China Without a VPN

Step-by-step guide to deploying OpenAI Codex CLI in China using WSL + MoBridge relay + DeepSeek API. No VPN needed, stable and cost-effective setup in minutes.

Tech Frontiers

May 28, 2026 · 1 min

DeepSeek V4-Pro Permanent Price Cut: Lower Developer Costs as LLM Price War Heats Up

DeepSeek announces permanent discount pricing for its V4-Pro model. Learn how this impacts developers, V4-Pro's competitive edge, and the latest LLM price war trends.

Product Reviews

May 28, 2026 · 2 min

Xiaomi MIMO Free 200M Token Hands-On: A Budget Alternative Amid AI Coding Tool Price Hikes

Hands-on review of Xiaomi MIMO 2.5's free 200M-token offer. Covers the application process, coding performance vs. Copilot and DeepSeek V4, usage limitations, and who should try this free AI coding tool.

Product Reviews

May 28, 2026 · 2 min

How to Choose an AI Coding Plan? Comparing Cursor, ChatGPT, GLM, and Other Top Options

Compare top AI coding plans including Cursor Max, ChatGPT Pro, GLM Coding Plan, and DeepSeek API — with pricing, performance, and use-case recommendations to help you choose.

Product Reviews

May 28, 2026 · 2 min

Three AI Coding Tool Alternatives After Cursor Restrictions

Cursor restricted in China? This article reviews three AI coding alternatives: Augment Code for smart prompt optimization, Trae for best value, and Amazon Kiro for process-driven development.

Deep Dives

May 28, 2026 · 2 min

AI Agent Development Methodology: A Complete Guide from ReAct to Enterprise-Grade Tech Stack

A deep dive into AI Agent development methodology, from the ReAct theoretical framework to a four-layer enterprise tech stack covering model services, Agent types, LangChain, and production deployment.

Tech Frontiers

May 27, 2026 · 2 min

AI Weekly: Kimi K2.6 Tops Open-Source Rankings, Qwen 3.6 and Google TTS Launch Together

Weekly AI roundup: Kimi K2.6 tops open-source rankings, Anthropic launches Opus 4.7 and Claude Design, Alibaba rolls out Qwen 3.6 series, Google releases emotion-controllable TTS model.

Tech Frontiers

May 27, 2026 · 2 min

GLM5 Architecture Leaked: 745B Parameters, DeepSeek V4 May Launch Quantized Smaller Model First

GLM5 code leak reveals 745B-parameter MoE architecture replicating DeepSeek V3. DeepSeek V4 may launch a 200B quantized model first, with flagship exceeding 1T parameters.

Tech Frontiers

May 27, 2026 · 2 min

Qwen Launches 400+ New Features as Wenxin 5.0 and Multiple LLMs Drop Simultaneously

Alibaba's Qwen app launches 400+ features integrating Alipay and Taobao, Baidu releases ERNIE 5.0, Meituan unveils a deep reasoning model, StepFun tops global speech AI rankings, and Anthropic's share nears Google's.

Tech Frontiers

May 27, 2026 · 2 min

GPT-5.2 Released: The Truth and Concerns Behind Its 390x Efficiency Gain

OpenAI releases GPT-5.2 with a 390x efficiency gain on ARC-AGI, beating Claude Opus 4.5. Deep analysis of the efficiency leap, user experience paradox, Disney's $1B deal, and the AI content quality crisis.

Product Reviews

May 27, 2026 · 3 min

Qwen 3.6 vs Gemma 4: In-Depth Comparison of Local AI Coding Models Through Real-World Development

Real-world comparison of Qwen 3.6 and Gemma 4 local AI models building a Markdown editor with Tauri, testing planning ability, code generation, and development efficiency.

Product Reviews

May 27, 2026 · 2 min

Kimi K2.6 Open-Source Hands-On: How Strong Is Its Orchestration of 300 Concurrent Agents?

Deep analysis of Moonshot AI's open-source Kimi K2.6 Agent orchestration: 300 sub-Agents executing 4000-step tasks, outperforming GPT-5.4 in coding benchmarks, LoRA fine-tuning on 2x RTX 4090s.