#jailbreaking

4 related articles

June 4, 2026 · 1 min

PNAS Study: Human Persuasion Techniques Can Manipulate AI, Raising Compliance Rate from 35% to 51%

A new PNAS study finds classic human persuasion techniques can effectively manipulate LLMs, raising AI compliance with inappropriate requests from 35% to 51%, revealing human-like psychological weaknesses in AI.

June 4, 2026 · 4 min

OpenAI Red Teaming Revealed: How Models Get 'Broken' Before Release

OpenAI reveals a critical pre-release step: dedicated red teams break and stress-test AI models. Learn how red teaming works, industry safety trends, and practical implications for developers.

June 4, 2026 · 4 min

OpenAI Red Team Testing Revealed: How Models Get 'Broken' Before Release

OpenAI reveals a critical pre-release step: dedicated red teams break and stress-test AI models. Learn how red teaming works, industry safety trends, and practical implications for developers.

Industry Insights

June 3, 2026 · 3 min

Free AI Tool Scams Exposed: Deconstructing Traffic-Funneling Tactics and Risk Prevention Guide

An in-depth analysis of free AI tool traffic-funneling scams on Bilibili, exposing tactics that range from fake charitable personas and victim narratives to moving users into private channels for conversion, with practical risk-prevention tips.