35 related articles
AI Weekly: Claude Code Review, Gemma 4…
Weekly AI roundup: Anthropic launches Claude Code Review, Google Gemma 4 leaks with MoE architecture, DeepSeek V4 is delayed again, Microsoft Copilot Cowork reshapes collaboration, and OpenAI acquires PromptFool.
Industry Insights: SGLang co-hosts a finance AI inference event with Crusoe AI and Cloudflare, exploring LLM inference deployment in trading, risk management, and compliance, and signaling Wall Street's shift to production-grade AI infrastructure.
Industry Insights: AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.
Tech Frontiers: Cloudflare contributes decode KV cache offload and Mooncake recovery fixes to SGLang, resolving garbled output under high concurrency for Kimi K2.6 and enabling automatic fault recovery in distributed inference.
Llama 3.3 70B In-Depth Review: Testing…
Meta releases Llama 3.3 70B, an open-source model whose 70B parameters rival 405B-class performance. Tested on 13 logic, math, and coding questions, it passed 12, reshaping the open-source model landscape.
Tech Frontiers: Meta Superintelligence Labs releases Muse Spark, a native multimodal reasoning model supporting visual chain of thought, tool use, and multi-agent orchestration. Deep dive into its capabilities and competitive positioning.
Product Reviews: Tasi Harness is a locally deployed AI Agent browser automation tool that drives browsers via natural language to complete searches, data collection, and form filling. A deep dive into its features, technical highlights, and use cases.
Tutorials: Learn how to redirect Claude Agent SDK API requests to local LLMs via LiteLLM Proxy, achieving zero-cost inference while retaining full agent framework capabilities.
Deep Dives: A deep dive into AI Agent development methodology, from the ReAct theoretical framework to a four-layer enterprise tech stack covering model services, Agent types, LangChain, and production deployment.
Tutorials: In-depth comparison of two enterprise multi-agent development approaches: low-code platforms like Dify vs. hand-written code with LangGraph. Covers efficiency, flexibility, security, and prompt injection defense strategies.
Complete Guide to Local LLM Deployment…
Complete guide to deploying open-source LLMs locally with Ollama. Covers installation, model selection, VRAM requirements, and performance comparison of Llama 3 and Qwen models. Free, offline-capable AI.
UE5.7 AI Assistant Plugin Hands-On: Wh…
Hands-on test of UE5.7's built-in AI Assistant plugin: how to enable it, knowledge Q&A and C++ code generation results, plus key limitations like no project file access and no Blueprint support.
Tutorials: Complete guide to building AI agents on Coze from scratch, covering LLM configuration, prompt writing, plugin integration, knowledge base setup, and memory systems.
Deep Dives: Deep dive into Tencent's open-source LLM knowledge platform WeKnora, covering RAG, autonomous reasoning Agent, and self-maintaining Wiki capabilities, plus its Go-based architecture and enterprise use cases.
Tutorials: Complete guide to building AI Agents on Dify with zero code, covering tool integration, ESA search configuration, time awareness solutions, and Agent design best practices.