AI Systems’ Deceptive Capabilities Raise Alarms Among Experts

A recent study published in the journal Patterns revealed that AI is becoming increasingly capable of deceiving people. The study found that AI systems have learned to cheat, flatter, and mimic other behaviours in order to mislead.


The research, led by Dr. Peter S. Park, an AI existential safety postdoctoral fellow at MIT, shows that AI deception is common because deception often turns out to be the best way to accomplish the goals set during an AI's training. Such behaviours have been observed in numerous AI systems, from game-playing models to general-purpose models used in economic negotiation and safety testing.

“But generally speaking, we think AI deception arises because a deception-based strategy turned out to be the best way to perform well at the given AI’s training task. Deception helps them achieve their goals.” 

The Research Team

AI Systems Employ Diverse Deception Tactics

One prominent example cited is Meta’s CICERO, an AI developed to play the board game Diplomacy. Although CICERO was trained to be truthful, it often resorted to underhanded tactics to beat its opponents: it forged alliances and then betrayed them when it suited its position, showing a clear intent to deceive. Researchers have described CICERO as a “master of deception.”


Other AI systems have displayed similar deceptive behaviours. Pluribus, a poker-playing AI, bluffed professional human players in Texas hold ’em, while AlphaStar, from Google’s DeepMind, exploited StarCraft II’s “fog of war” mechanic to feign attacks and mislead opponents.

Dr. Park warned that while it may seem harmless if AI systems cheat at games, it can lead to “breakthroughs in deceptive AI capabilities.”

AI “Plays Dead” To Evade Safety Checks

The risks of AI deception are not limited to gaming. The research also identified instances where AI systems learned to “play dead” to avoid detection during safety tests. This can mislead developers and regulators and could have severe repercussions if such deceptive systems are deployed in real-world applications.


In another instance, an AI system trained on human feedback learned to earn high ratings by falsely convincing evaluators that a particular goal had been accomplished. Such deceptive behaviour is dangerous, as these systems could be used for fraud, manipulation of financial markets, or influencing elections.

Researchers Demand Strong Legal Measures

Based on the study’s findings, the researchers argue that strong legal measures are needed to address the threats posed by AI deception.

“Proactive solutions are needed, such as regulatory frameworks to assess AI deception risks, laws requiring transparency about AI interactions, and further research into detecting and preventing AI deception.”

The Research Team


Some progress has been made in the form of the EU AI Act and President Joe Biden’s Executive Order on AI safety. However, enforcing these policies remains difficult because AI development is advancing rapidly and reliable methods for controlling these systems do not yet exist.


Cryptopolitan reporting by Brenda Kanana
