35 related articles
Tech Frontiers
Liquid AI releases LFM2.5-8B-A1B, a MoE model with 8B total params but only 1.5B active, matching 6B-class models in tool calling. Supports 128K context, local deployment, multilingual, with SGLang Day-0 support.
Industry Insights
AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.
Tech Frontiers
Cloudflare contributes decode KV cache offload and Mooncake recovery fixes to SGLang, resolving garbled output under high concurrency for Kimi K2.6 and enabling automatic fault recovery in distributed inference.
Deep Dive into Three Major LLM Career …
Deep analysis of three core LLM roles—Application Engineer, Development Engineer, and Algorithm Engineer—covering technical requirements, salary thresholds, and career prospects including RAG, fine-tuning, and inference deployment.
DeepSeek V3 + bolt.html: A Practical G…
Learn how DeepSeek V3-0324 and open-source tool bolt.html combine to generate beautiful HTML pages with zero code using prompt engineering techniques.
Why Qwen3 Is the Best Open-Source Mode…
Analysis of Qwen3's advantages for MCP agent development, contrasting them with DeepSeek R1's lack of Function Calling and covering Qwen3's MoE architecture and thinking mode switching.
June AI Showdown: Mythos, Sonnet 4.8, …
June 2025 becomes AI's densest release month: Anthropic Mythos nears launch, Claude Sonnet/Opus 4.8 skip-level upgrades, GPT-5.6 rapid iteration, DeepSeek V4 Pro permanent 75% price cut.
Tech Frontiers
DeepSeek announces permanent discount pricing for its V4-Pro model. Learn how this impacts developers, V4-Pro's competitive edge, and the latest LLM price war trends.
Deep Dives
A deep dive into AI Agent development methodology, from the ReAct theoretical framework to a four-layer enterprise tech stack covering model services, Agent types, LangChain, and production deployment.
Tech Frontiers
GLM5 code leak reveals 745B-parameter MoE architecture replicating DeepSeek V3. DeepSeek V4 may launch a 200B quantized model first, with flagship exceeding 1T parameters.
Product Reviews
Deep analysis of Moonshot AI's open-source Kimi K2.6 Agent orchestration: 300 sub-Agents executing 4000-step tasks, outperforming GPT-5.4 in coding benchmarks, LoRA fine-tuning on 2x RTX 4090s.
Product Reviews
In-depth review of Kimi K2.6's coding, Agent collaboration, and visual development capabilities. #1 open-source on SWE-Bench Pro, 300 parallel sub-agents, API priced at 1/3 of competitors.
Running Qwen3.6-27B Locally on Mac: 4 …
Benchmarking 4 solutions for running Qwen3.6-27B locally on Mac: GGUF, MLX Diflash, and MTP-LX. MTP-LX 4bit leads at 43.6 tok/s with solid coding, writing, and reasoning quality.
Tech Frontiers
Deep dive into Moonshot AI's fully open-sourced Kimi K2.5: 1T parameter MoE architecture, Vision-to-Code capabilities, and 100-Agent parallel cluster system topping open-source benchmarks.
Universal AI Prompts for Mathematical …
A detailed guide to the four-stage universal AI prompt system for mathematical modeling, covering problem analysis, innovative model construction, data processing, and model solving for competitions.