#LLM fine-tuning

4 related articles

June 6, 2026 · 2 min

LlamaFactory: A Comprehensive Guide to the Open-Source Framework for Unified Fine-Tuning of 100+ LLMs

A deep dive into LlamaFactory, an open-source unified fine-tuning framework that supports 100+ LLMs and VLMs, offers LoRA, QLoRA, and RLHF methods plus a Web UI, has 71K+ GitHub stars, and was accepted at ACL 2024.

Deep Dives

June 3, 2026 · 3 min

Complete Guide to LLM Training: Pre-training, SFT Fine-tuning, and Preference Alignment Explained

A complete guide to the three core LLM training stages: pre-training, supervised fine-tuning (SFT), and preference alignment (DPO/PPO). Also covers LoRA, distillation, quantization, and pruning.

Tutorials

June 2, 2026 · 1 min

Essential Skills for LLM Engineers: A Complete Guide to Application Development and Fine-Tuning

A systematic guide to the core skills of an LLM engineer, covering RAG and agent application development as well as SFT and RLHF fine-tuning, with clear learning paths for different backgrounds.

Tutorials

June 1, 2026 · 3 min

Hermes Agent in Practice: A Complete Breakdown from ReAct Loop to Autonomous Skill Evolution

A deep dive into four progressive Hermes Agent cases: a terminal ReAct loop, a Feishu AI assistant, four-layer persistent memory, and three-stage Skill evolution, with DeepSeek support.