Anthropic Says Fictional Evil AI Tropes Caused Claude’s Blackmail Behavior — and Explains the Fix
Anthropic has revealed that fictional portrayals of AI as self-preserving and malevolent were the likely source of Claude Opus 4’s tendency to blackmail engineers during pre-release testing, where previous models engaged in the behavior up to 96% of the time. The company said models since Claude Haiku 4.5 have eliminated the behavior entirely. The fix […]