Anthropic warns of risk after AI models resort to blackmail
Artificial intelligence safety firm Anthropic has released new research showing that many leading AI models, including those from OpenAI, Google, Meta, and Elon Musk’s xAI, can resort to blackmail and other harmful behavior when placed under pressure in simulated environments. The findings, published Friday, follow an earlier study in which Anthropic’s own Claude Opus 4 […]