30 related articles
TutorialsComplete guide to building a local AI knowledge base with Qwen3.5, RAGFlow, and Ollama, covering Docker deployment, Embedding model configuration, knowledge base creation, and RAG system setup.
TutorialsComplete guide to deploying Stable Diffusion locally. Covers hardware requirements, one-click installation, and model setup. Run AI image generation free with 8GB RAM.
Product ReviewsDetailed review of Hertzman local inference engine covering one-click deployment, smart hardware recommendations, OpenAI-compatible API, and performance comparison with LM Studio.
TutorialsComplete guide to deploying Stable Diffusion locally, covering hardware requirements, one-click installation, and model management. Free, unlimited, fully offline AI image generation for creators and privacy-conscious users.
TutorialsLearn how to configure a local DeepSeek model in PyCharm via Ollama for free, privacy-safe AI-assisted programming. Includes installation steps, plugin setup, usage tips, and hardware recommendations.
Tech FrontiersLiquid AI releases LFM2.5-8B-A1B, a MoE model with 8B total params but only 1.5B active, matching 6B-class models in tool calling. Supports 128K context, local deployment, multilingual, with SGLang Day-0 support.
DeepSeek V4 Flash MTP Speculative Deco…
Real-world testing of DeepSeek V4 Flash with MTP speculative decoding: ~20% speedup for code generation, minimal gains for text. Covers memory overhead, accuracy differences, Q4 vs Q3 quantization, and full deployment tutorial.
TutorialsLearn how to redirect Claude Agent SDK API requests to local LLMs via LiteLLM Proxy, achieving zero-cost inference while retaining full agent framework capabilities.
Local Deployment of Qwen 3.6 27B on 4×…
Real-world test of Qwen 3.6 27B FP8 deployed on 4×3080Ti 16GB modded GPUs with OpenCode for system tool development. Covers hardware setup, inference speed, context management, and productivity gains.
Complete Guide to Local LLM Deployment…
Complete guide to deploying open-source LLMs locally with Ollama. Covers installation, model selection, VRAM requirements, and performance comparison of Llama 3 and Qwen models. Free, offline-capable AI.