#LLM architecture

3 related articles

June 6, 2026 · 3 min

vLLM Deep Dive: How PagedAttention Enables High-Throughput LLM Inference

A deep dive into the core technologies behind vLLM's high-throughput LLM inference: PagedAttention memory management, continuous batching, and distributed deployment, plus a comparison with TensorRT-LLM.

June 4, 2026 · 1 min

PNAS Study: Human Persuasion Techniques Can Manipulate AI, Raising Compliance Rate from 35% to 51%

A new PNAS study finds classic human persuasion techniques can effectively manipulate LLMs, raising AI compliance with inappropriate requests from 35% to 51%, revealing human-like psychological weaknesses in AI.

Tutorials

June 2, 2026 · 1 min

Essential Skills for LLM Engineers: A Complete Guide to Application Development and Fine-Tuning

A systematic guide to the core skills of an LLM engineer, covering application development (RAG, agents) and fine-tuning (SFT, RLHF), with clear learning paths for different backgrounds.