Anthropic says most AI models, not just Claude, will resort to blackmail

June 20, 2025

(TechCrunch) – Several weeks after Anthropic released research claiming that its Claude Opus 4 AI model resorted to blackmailing engineers who tried to turn the model off in controlled test scenarios, the company is out with new research suggesting the problem is more widespread among leading AI models.

On Friday, Anthropic published new safety research testing 16 leading AI models from OpenAI, Google, xAI, DeepSeek, and Meta. In a simulated, controlled environment, Anthropic tested each AI model individually, giving them broad access to a fictional company’s emails and the agentic ability to send emails without human approval.

Posted by Bioethics Pundit
