#Mixture of Experts

85 related articles

June 3, 2026 · 3 min

Google Gemma 4 Hands-On Review: Offline on Smartphones + Ollama Deployment Tutorial

Hands-on testing of Google Gemma 4 open-source models running offline on three phones, with Dense vs. MoE architectures explained and a complete Ollama + Claude Code deployment tutorial.

Tutorials

June 2, 2026 · 1 min

Essential Skills for LLM Engineers: A Complete Guide to Application Development and Fine-Tuning

A systematic guide to core LLM engineer skills, covering RAG and Agent app development plus SFT and RLHF fine-tuning, with clear learning paths for different backgrounds.

Tech Frontiers

June 2, 2026 · 3 min

Windsurf Rebrands as Devin Desktop: A Complete Breakdown of the Multi-Agent Collaborative IDE Platform

Windsurf rebrands as Devin Desktop with Agent Command Center for multi-agent management, open-source ACP protocol, and a Rust-rewritten local Agent. Full breakdown of the upgrade and platform strategy.

Tutorials

June 2, 2026 · 2 min

Connect Claude Code to DeepSeek: Zero-Barrier Four-Step Configuration Tutorial

Step-by-step tutorial on connecting Claude Code to DeepSeek using ccswitch. No overseas account or credit card needed — just 10 RMB to start using an AI coding assistant.

Tutorials

June 2, 2026 · 3 min

llama.cpp MTP Acceleration Deployment Guide: Configuration Steps & Real-World Benchmarks

Guide to enabling MTP (multi-token prediction) acceleration in llama.cpp, covering CUDA setup, desktop configuration, model selection, and benchmarks showing ~60 tokens/s with Qwen3 27B.

Tutorials

June 2, 2026 · 3 min

Complete Guide to Connecting Codex with DeepSeek V4: API Configuration & Plugin Unlock

Complete guide to connecting Codex with DeepSeek V4 via CC Switch relay, including API Key setup, channel configuration, and plugin unlock steps for cost-effective AI programming.

Tutorials

June 2, 2026 · 3 min

Integrating AI Models with Zotero 9: Complete Configuration Guide for Doubao and DeepSeek

Complete guide to integrating Doubao and DeepSeek AI models with Zotero 9 for paper summarization, full-text translation, and intelligent Q&A.

Tutorials

June 2, 2026 · 2 min

Tutorial: Building a Low-Cost AI Code Editor with DeepSeek-V3 + VSCode

Step-by-step tutorial: Build a low-cost AI programming assistant using DeepSeek-V3 API with VSCode's Continue plugin. Covers setup, API Key configuration, code completion demo, and Ollama local deployment.

Tutorials

June 2, 2026 · 3 min

Hermes Agent Deployment Tutorial: An AI Assistant That Uses Fewer Tokens Than CrawlAI

Complete Hermes Agent deployment tutorial for Windows: environment setup, model configuration, WeChat channel connection, and troubleshooting. Uses fewer tokens than CrawlAI with direct WeChat chat support.

Tutorials

June 2, 2026 · 3 min

Hermes Orchestrating DeepSeek + MiniMax Dual-AI Collaborative Coding: A From-Scratch Project Test

Testing the Hermes agent coordinating DeepSeek V4 and MiniMax 2.7 for collaborative coding: PDF export in 9 minutes, and an RSS service built from scratch in Nim.

Product Reviews

June 2, 2026 · 2 min

Free Unlimited DeepSeek Full Version? Deep Dive into AI Aggregation Platforms & Risk Analysis

In-depth analysis of AI aggregation platforms claiming free unlimited DeepSeek R1 full version access, revealing data security risks and sustainability concerns, with reliable alternatives.

Product Reviews

June 2, 2026 · 3 min

Windsurf Wave 3 Deep Dive: MCP Support, Turbo Mode & Multi-Model Integration

Deep dive into Windsurf Wave 2 & Wave 3 updates: MCP protocol support, Turbo auto mode, DeepSeek integration, Tab to Jump, pricing comparison with Cursor.

Product Reviews

June 2, 2026 · 4 min

OpenHuman Deep Dive: A Context-First Open-Source Personal AI Agent

Deep dive into OpenHuman open-source AI Agent: context-first architecture, Rust+React hybrid, Memory Tree system, Token Juice compression, and multi-model routing.

Tutorials

June 1, 2026 · 3 min

oMLX + MTP + Qwen3.6: Local AI Coding Speed Breaks New Records

Using oMLX with MTP and Qwen3.6 35B on Apple Silicon Mac to achieve 86.7 tokens/s local coding speed, building a full-stack app in under 5 minutes.

Tutorials

June 1, 2026 · 3 min

OpenRouter Free Models Tutorial: Accessing 28 Free AI Models & Deep Dive into the AI Market Landscape

Guide to OpenRouter's 28 free AI models with API setup, covering GPT-OSS 120B, DeepSeek V4 Flash, and leaderboard insights into the AI model market landscape.

Tech Frontiers

June 1, 2026 · 3 min

AI Weekly: Claude Code Review, Gemma 4 Leak & DeepSeek V4 Delayed

Weekly AI roundup: Anthropic launches Claude Code review, Google Gemma 4 leaks with MoE architecture, DeepSeek V4 delayed again, Microsoft Copilot Cowork reshapes collaboration, and OpenAI acquires PromptFool.

Tech Frontiers

May 30, 2026 · 2 min

SGLang v0.5.12.post1 Released: DeepSeek V4 Stability Fixes and Blackwell Adaptation

SGLang v0.5.12.post1 stability patch details: 12 critical fixes covering DeepSeek V4 garbled text and crashes, NIXL PD disaggregated inference logic, Blackwell B300 adaptation, and cold start optimization.

Tech Frontiers

May 30, 2026 · 2 min

Step 3.7 Flash: Deep Dive into the 198B Sparse MoE Multimodal Model

Deep dive into StepFun AI's Step 3.7 Flash, a 198B sparse MoE vision-language model with 256K context and 3-level reasoning, excelling in multimodal understanding, AI coding, and Agent tool orchestration.

Tech Frontiers

May 30, 2026 · 2 min

LFM2.5-8B-A1B: A MoE Model with 1.5B Active Parameters Delivering 4x Its Weight Class Performance

Liquid AI releases LFM2.5-8B-A1B, a MoE model with 8B total params but only 1.5B active, matching 6B-class models in tool calling. Supports 128K context, local deployment, multilingual, with SGLang Day-0 support.

Industry Insights

May 30, 2026 · 2 min

AMD MI355X Beats B200: Full-Stack Optimization Breakdown for 5% Lower TCO on DeepSeek-R1 Inference

AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.