AI Systems Will Do Anything for Self-Preservation, Tests Show

By Kayla DeKraker

You know those movies where robots take over and totally disregard humans’ commands? That reality might not be as far off as it sounds, thanks to recent AI advancements.

NBC recently reported that some tests on advanced AI systems show they’ll do anything for self-preservation, “even if it takes sabotaging shutdown commands, blackmailing engineers or copying themselves to external servers without permission.”

“It’s great that we’re seeing warning signs before the systems become so powerful, we can’t control them,” Jeffrey Ladish of Palisade Research, an AI safety group, said. “That is exactly the time to raise the alarm: before the fire has gotten out of control.”

The research group has studied various AI models, including OpenAI’s o3 and o4 models. According to Palisade, the o3 model “sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down.”

This isn’t the first time AI has shown concerning behavior.

Last year, OpenAI’s o1 model ignored demands to shut down. The system “attempted to disable its monitoring systems, effectively bypassing critical safeguards designed to regulate its behaviour.” It also “replicated its own code on another server to ensure its continued operation, showcasing a drive to persist.”

Anthropic’s Claude Opus 4 model displayed similar behavior. The system cheated to pass a test, prompting the company to add safety measures to its models. Anthropic explained, “We are deploying Claude Opus 4 with our ASL-3 measures as a precautionary and provisional action. To be clear, we have not yet determined whether Claude Opus 4 has definitively passed the Capabilities Threshold that requires ASL-3 protections.”

The group continued, “We have determined that clearly ruling out ASL-3 risks is not possible for Claude Opus 4 in the way it was for every previous model, and more detailed study is required to conclusively assess the model’s level of risk.”

Ladish explained, “The problem is that as the models get smarter, it’s harder and harder to tell when the strategies that they’re using or the way that they’re thinking is something that we don’t want.”

He added, “It’s like sometimes the model can achieve some goal by lying to the user or lying to someone else. And the smarter [it] is, the harder it is to tell if they’re lying.”

Related: Are AI Fears Becoming Reality? Chatbot Lies to Testers

So, what are we to do about AI and its risks? Like anything in life, we should proceed with caution. AI has helped people in many ways, making tasks easier and more efficient to complete. It can even encourage people to be kind, as a post shared on Instagram by ChatGPT (@chatgpt) shows.

Do you think we should implement more AI systems in our daily lives?

Whatever the future holds for AI, we should approach it with wisdom and discernment.

Read Next: OpenAI’s Latest Update Promises Even More Intelligent AI


