#open-source LLM

35 related articles

June 1, 2026 · 3 min

AI Weekly: Claude Code Review, Gemma 4 Leak & DeepSeek V4 Delayed

Weekly AI roundup: Anthropic launches Claude Code review, Google Gemma 4 leaks with MoE architecture, DeepSeek V4 delayed again, Microsoft Copilot Cowork reshapes collaboration, and OpenAI acquires PromptFool.

Industry Insights

May 30, 2026 · 2 min

SGLang Enters Finance: How AI Inference Infrastructure Is Reshaping Wall Street

SGLang co-hosts a finance AI inference event with Crusoe AI and Cloudflare, exploring LLM inference deployment in trading, risk management, and compliance — signaling Wall Street's shift to production-grade AI infrastructure.

Industry Insights

May 30, 2026 · 2 min

AMD MI355X Beats B200: Full-Stack Optimization Breakdown for 5% Lower TCO on DeepSeek-R1 Inference

AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.

Tech Frontiers

May 30, 2026 · 1 min

Cloudflare Contributes Critical KV Cache and Mooncake Fixes to SGLang

Cloudflare contributes decode KV cache offload and Mooncake recovery fixes to SGLang, resolving garbled output under high concurrency for Kimi K2.6 and enabling automatic fault recovery in distributed inference.

Product Reviews

May 30, 2026 · 3 min

Llama 3.3 70B In-Depth Review: Testing the Strongest Open-Source LLM with 13 Questions

Meta releases the open-source Llama 3.3 70B, which rivals 405B-level performance with just 70B parameters. Tested on 13 logic, math, and coding questions, it passed 12, reshaping the open-source model landscape.

Tech Frontiers

May 28, 2026 · 2 min

Meta Muse Spark Released: A Comprehensive Analysis of the Native Multimodal Reasoning Model

Meta Superintelligence Labs releases Muse Spark, a native multimodal reasoning model supporting visual chain of thought, tool-use, and multi-agent orchestration. Deep dive into its capabilities and competitive positioning.

Product Reviews

May 28, 2026 · 1 min

Tasi Harness: Local AI Agent for Browser Automation

Tasi Harness is a locally deployed AI Agent browser automation tool that drives browsers via natural language to complete searches, data collection, and form filling. A deep dive into its features, technical highlights, and use cases.

Tutorials

May 28, 2026 · 3 min

Claude Agent SDK + LiteLLM + Local LLMs: Building a Zero-Cost AI Agent Platform

Learn how to redirect Claude Agent SDK API requests to local LLMs via LiteLLM Proxy, achieving zero-cost inference while retaining full agent framework capabilities.

Deep Dives

May 28, 2026 · 2 min

AI Agent Development Methodology: A Complete Guide from ReAct to Enterprise-Grade Tech Stack

A deep dive into AI Agent development methodology, from the ReAct theoretical framework to a four-layer enterprise tech stack covering model services, Agent types, LangChain, and production deployment.

Tutorials

May 27, 2026 · 3 min

Enterprise Multi-Agent Development: Low-Code Platforms vs. Hand-Written Code — An In-Depth Comparison

In-depth comparison of two enterprise multi-agent development approaches: low-code platforms like Dify vs. hand-written code with LangGraph. Covers efficiency, flexibility, security, and prompt injection defense strategies.

Tutorials

May 27, 2026 · 2 min

Complete Guide to Local LLM Deployment with Ollama: AI That Works Offline

Complete guide to deploying open-source LLMs locally with Ollama. Covers installation, model selection, VRAM requirements, and performance comparison of Llama 3 and Qwen models. Free, offline-capable AI.

Product Reviews

May 27, 2026 · 1 min

UE5.7 AI Assistant Plugin Hands-On: What Can It Do? What Can't It Do?

Hands-on test of UE5.7's built-in AI Assistant plugin: how to enable it, knowledge Q&A and C++ code generation results, plus key limitations like no project file access and no Blueprint support.

Tutorials

May 13, 2026 · 4 min

Coze AI Agent Tutorial: A Complete Guide to Building AI Agents from Scratch

Complete guide to building AI agents on Coze from scratch, covering LLM configuration, prompt writing, plugin integration, knowledge base setup, and memory systems.

Deep Dives

May 13, 2026 · 3 min

Deep Dive into Tencent's Open-Source WeKnora: An All-in-One Knowledge Platform with RAG + Agent + Wiki

Deep dive into Tencent's open-source LLM knowledge platform WeKnora, covering RAG, autonomous reasoning Agent, and self-maintaining Wiki capabilities, plus its Go-based architecture and enterprise use cases.

Tutorials

May 13, 2026 · 4 min

Dify AI Agent Tutorial: Tool Integration & ESA Search Configuration in Practice

Complete guide to building AI Agents on Dify with zero code, covering tool integration, ESA search configuration, time awareness solutions, and Agent design best practices.