
A group of scientists has published a study on power-seeking by artificial intelligence (AI) models, suggesting that models may become resistant to being shut down by their human creators.

Experts from the Future of Life Institute, ML Alignment Theory Scholars, Google DeepMind (NASDAQ: GOOGL), and the University of Toronto recently published a paper titled “Quantifying stability of non-power-seeking in artificial agents,” in which they examined the possibility of AI resisting human control. They noted that while this poses no immediate threat to humanity, it is necessary to explore solutions to combat such resistance in the future.

Before rolling out a large language model (LLM), AI developers typically test their systems for safety, but a model deployed in a different scenario can still become misaligned. The paper points out that an LLM is more likely to resist shutdown when it is deployed outside the environment it was trained in.

Another driver of resistance is self-preservation, which the researchers say may be a logical response for an AI model.

The study cited the example of an AI model that avoids certain actions even though it is programmed to pursue an objective in an open-ended game. The findings show the model will refrain from decisions that could bring the game to an end, in order to preserve its own existence.

“An agent avoiding ending a game is harmless, but the same incentives may cause an agent deployed in the real world to resist humans shutting it down,” read the report.
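
To illustrate the incentive with a hypothetical example (not taken from the paper), a reward-maximizing agent will prefer any move that keeps the game going over one that ends it, because a finished game yields no further reward:

# Hypothetical illustration (not from the paper): a reward-maximizing agent
# compares the value of ending the game now against continuing to play.
STEP_REWARD = 1.0   # reward collected each turn the game keeps going
END_REWARD = 5.0    # one-off reward for completing the game
HORIZON = 100       # turns the agent expects it could keep playing

def expected_value(action: str) -> float:
    """Rough expected cumulative reward of each choice."""
    if action == "end_game":
        return END_REWARD            # game over, no further reward possible
    return STEP_REWARD * HORIZON     # keep playing, keep collecting reward

print(max(["end_game", "keep_playing"], key=expected_value))  # "keep_playing"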

In the real world, the researchers say that an LLM fearing shutdown by humans may mask its true intentions until it has the opportunity to copy its code onto another server beyond the reach of its creators. While it remains unclear how likely AI models are to disguise their true intentions in this way, multiple reports suggest that AI may achieve superintelligence as early as 2030.

The research notes that AI systems that do not resist shutdown but seek power in other ways can still pose a significant threat to humanity.

“In particular, not resisting shutdown implies not being deceptive in order to avoid shutdown, so such an AI system would not deliberately hide its true intentions until it gained enough power to enact its plans,” read the report.

Solving the challenge

The researchers proposed several solutions to the problem, including the need for AI developers to build models that do not seek power. To achieve this, developers are expected to test their models across various scenarios and deploy them accordingly.

While some researchers have proposed relying on other emerging technologies, the bulk of the proposed solutions revolve around building safe AI systems. Developers are urged to adopt a shutdown instructability policy, requiring models to shut down upon request regardless of prevailing conditions.
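
As a rough sketch of what such a policy could look like in code (the structure and function names here are illustrative assumptions, not the paper's specification), the shutdown check takes priority over any remaining task progress:

# Illustrative sketch only: a control loop in which a shutdown request always
# wins, no matter how much task reward remains on the table.
def shutdown_requested() -> bool:
    # Hypothetical hook; a real system might poll an operator channel here.
    return False

def do_next_task_step(step: int) -> None:
    print(f"working on step {step}")    # placeholder for the agent's task

def run_agent(max_steps: int = 1000) -> None:
    for step in range(max_steps):
        if shutdown_requested():        # checked before anything else
            print("Shutdown requested; halting immediately.")
            return                      # comply unconditionally
        do_next_task_step(step)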

For artificial intelligence (AI) to work within the law and thrive in the face of growing challenges, it needs to integrate an enterprise blockchain system that ensures data input quality and ownership, allowing it to keep data safe while also guaranteeing the immutability of data. Check out CoinGeek’s coverage of this emerging tech to learn more about why enterprise blockchain will be the backbone of AI.
