13 related articles
Tutorials: A deep dive into Agent Tuning principles and practices, covering why Agent training is needed, the evolution from Prompt to RAG to Agent, development workflows, and cost assessment for private deployment.
Tutorials: Learn how to deploy LLMs locally with Ollama in three simple steps: install, choose a model, and run. No coding required, supports offline use, and completely free.
Research: Yale and other institutions introduce SciMDR, a two-stage data synthesis pipeline enabling a 7B model to match GPT-5 level performance in scientific literature comprehension.
Tutorials: Step-by-step guide to building a local RAG knowledge base using RAGFlow, Ollama, and LM Studio with Docker, covering Embedding model deployment and network troubleshooting for private AI Q&A.
Tutorials: Step-by-step tutorial: Build a low-cost AI programming assistant using DeepSeek-V3 API with VSCode's Continue plugin. Covers setup, API Key configuration, code completion demo, and Ollama local deployment.
Tutorials: Complete guide to AnythingLLM local knowledge base setup: installation tips, Ollama model configuration, document vectorization, recall optimization, and API integration.
Product Reviews: In-depth analysis of AI aggregation platforms that claim to offer free, unlimited access to the full version of DeepSeek R1, revealing data security risks and sustainability concerns, with reliable alternatives.
Product Reviews: Detailed review of Hertzman local inference engine covering one-click deployment, smart hardware recommendations, OpenAI-compatible API, and performance comparison with LM Studio.
Practical Guide to Building Multi-Agen…
Learn how to build a multi-Agent collaborative system with CrewAI and FastAPI. Covers Agent, Task, Crew concepts, GPT/Tongyi Qianwen/Ollama integration, with complete code examples and model comparisons.
Running Qwen3.6-27B Locally on Mac: 4 …
Benchmarking 4 solutions for running Qwen3.6-27B locally on Mac: GGUF, MLX Diflash, and MTP-LX. MTP-LX 4bit leads at 43.6 tok/s with solid coding, writing, and reasoning quality.
Local Deployment of Qwen 3.6 27B on 4×…
Real-world test of Qwen 3.6 27B FP8 deployed on 4×3080Ti 16GB modded GPUs with OpenCode for system tool development. Covers hardware setup, inference speed, context management, and productivity gains.
Decoding LLM Naming Conventions: Param…
Decode LLM naming conventions and understand what 32B parameters and the AWQ/GGUF quantization formats mean, with 4-bit VRAM estimation formulas, MoE model pitfalls, and model selection by GPU tier.