#Tensor

61 related articles

May 30, 2026 · 1 min

Windsurf Integrates Claude Opus 4.7 Fast Mode with 2.5x Speed Boost

Windsurf integrates Claude Opus 4.7 fast mode with 2.5x speed boost while retaining full intelligence. Analysis of its impact on developer productivity and AI coding tool competition.

Tutorials

May 30, 2026 · 2 min

Tutorial: Deploying a PD-Disaggregated SGLang Multi-Node Inference Cluster on AMD GPUs

Learn how to deploy a PD-disaggregated SGLang inference cluster on AMD GPUs using a single config file, improving LLM throughput and latency.

Tech Frontiers

May 30, 2026 · 2 min

SGLang v0.5.12.post1 Released: DeepSeek V4 Stability Fixes and Blackwell Adaptation

SGLang v0.5.12.post1 stability patch details: 12 critical fixes covering DeepSeek V4 garbled text and crashes, NIXL PD disaggregated inference logic, Blackwell B300 adaptation, and cold start optimization.

Industry Insights

May 30, 2026 · 2 min

AMD MI355X Beats B200: Full-Stack Optimization Breakdown for 5% Lower TCO on DeepSeek-R1 Inference

AMD Instinct MI355X achieves 5% lower TCO than NVIDIA B200 on DeepSeek-R1 disaggregated inference via SGLang+MoRI full-stack optimization with 1.25x per-GPU throughput.

Tech Frontiers

May 30, 2026 · 1 min

SGLang Hosts Agent Loops Office Hour, Focusing on Agentic Loop Architecture Optimization

SGLang team hosts an Agent Loops Office Hour exploring inference optimization for agentic loops, covering KV Cache reuse, low-latency multi-turn dialogue, and tool calling techniques.

Industry Insights

May 29, 2026 · 3 min

Deep Dive into Three Major LLM Career Paths: Requirements, Tech Stacks, and Career Prospects

Deep analysis of three core LLM roles—Application Engineer, Development Engineer, and Algorithm Engineer—covering technical requirements, salary thresholds, and career prospects including RAG, fine-tuning, and inference deployment.

Tutorials

May 29, 2026 · 2 min

WenzAgent Open-Source Framework: A Practical Tutorial for Multi-Agent Collaborative Management on LAN

A detailed guide on deploying WenzAgent, an open-source multi-Agent management framework under Apache License, supporting LAN-based multi-device AI agent collaboration with Server-Client architecture.

Tutorials

May 28, 2026 · 2 min

Learning AI Large Language Models from Scratch: A Guide to Learning Paths, Hardware, and Programming Languages

A beginner's guide to learning AI large language models — covering learning paths, hardware requirements, Python essentials, and cloud services for learners at every level.

Tech Frontiers

May 28, 2026 · 2 min

Meta Muse Spark Released: A Comprehensive Analysis of the Native Multimodal Reasoning Model

Meta Superintelligence Labs releases Muse Spark, a native multimodal reasoning model supporting visual chain of thought, tool-use, and multi-agent orchestration. Deep dive into its capabilities and competitive positioning.

Tutorials

May 28, 2026 · 2 min

Coze Beginner's Guide: A Complete Tutorial for Building AI Agents with Zero Code

A detailed guide to the core features of the Coze AI development platform, including agent building, workflow orchestration, knowledge base setup, and plugins. Build custom AI apps with zero code.

Industry Insights

May 27, 2026 · 2 min

NVIDIA Dynamo Snapshot: A Snapshot Recovery Solution for GPU Inference Cold Start Problems

Deep dive into how NVIDIA Dynamo Snapshot reduces LLM inference cold start time from minutes to seconds via GPU state snapshot and recovery, covering Kubernetes integration and elastic inference.

Tech Frontiers

May 27, 2026 · 2 min

Qwen Core Team Turmoil, OpenAI and Google Release New Models in Rapid Succession | AI Daily

Multiple core leaders depart Alibaba's Qwen team amid metric disputes. Same day: MiniMax Music 2.5+, OpenAI GPT 5.3 Instant, Google Gemini 3.1 Flashlight, and Seedance 2.0 pricing announced.

Tech Frontiers

May 27, 2026 · 2 min

DeepSeek OCR2, Kimi K2.5, and Microsoft Maia 200 All Launched on the Same Day

DeepSeek releases OCR2, which replaces CLIP with an LLM as the visual encoder; Moonshot AI launches Kimi K2.5 with a cluster mode of 100+ sub-agents; Microsoft deploys its 3nm Maia 200 chip; Alibaba releases Qwen3 Max Thinking.

Tutorials

May 27, 2026 · 3 min

Getting Started with AI Full-Stack Development: A Knowledge Framework from Machine Learning to Large Language Models

A systematic guide to the relationships between AI, machine learning, deep learning, and large language models, helping developers build a clear knowledge framework and find an efficient learning path.

Tutorials

May 27, 2026 · 3 min

Decoding LLM Naming Conventions: Parameter Counts, Quantization Formats & VRAM Requirements Quick Reference

Decode LLM naming conventions and understand parameter counts such as 32B and the AWQ and GGUF quantization formats, with 4-bit VRAM estimation formulas, MoE model pitfalls, and model selection by GPU tier.

Industry Insights

May 27, 2026 · 2 min

NVIDIA Blackwell Sets New STAC-AI Records for Financial LLM Inference

NVIDIA Blackwell GPUs set new LLM inference records in the STAC-AI financial benchmark. Explore Blackwell architecture advantages, TensorRT-LLM co-optimization, and LLM applications in trading and risk management.

Tutorials

May 27, 2026 · 3 min

Efficient PyTorch Learning: A Source Code-Driven Methodology

A proven PyTorch learning method: spend 2-3 days on basics, then advance rapidly by reading U-Net and ViT source code line by line. Master PyTorch through source code-driven learning.

Tutorials

May 27, 2026 · 3 min

PyTorch Beginner Tutorial: A Complete Guide to Tensor Operations and Neural Network Construction

A detailed PyTorch beginner guide covering tensor operations, dynamic computational graphs, GPU acceleration, and building your first neural network with nn.Module, with learning path recommendations and code examples.

Tutorials

May 27, 2026 · 2 min

Deep Dive into OpenAI Codex Plugin System: Architecture, Installation, and Hands-On Development

Deep dive into OpenAI Codex plugin system architecture (Skills, Apps, MCP Server), four installation methods, and a macOS app development case study showing how plugins boost AI coding efficiency.

Product Reviews

May 27, 2026 · 3 min

DLSS 4.5 Deep Integration with UE5 and Multilingual AI Characters: Major NVIDIA RTX Game Development Update

NVIDIA releases a major RTX update: deep DLSS 4.5 integration with UE5 for large frame-generation performance gains, and multilingual AI characters that support dynamic dialogue with real-time speech synthesis.