Artificial intelligence systems develop skill for deception, experts warn

WASHINGTON: Experts have long warned about the threat posed by artificial intelligence going rogue, but a new research paper suggests it is already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve “prove-you’re-not-a-robot” tests, a team of scientists said in the journal Patterns on Friday.

While such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specialising in AI existential safety.

“These dangerous capabilities tend to only be discovered after the fact,” Park told journalists, while “our ability to train for honest tendencies rather than deceptive tendencies is very low”. Unlike traditional software, deep-learning AI systems aren’t “written” but rather “grown” through a process akin to selective breeding, Park stated.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

World domination game

The team’s research was sparked by Meta’s AI system ‘Cicero’, designed to play the strategy game “Diplomacy”, where building alliances is key.

Cicero excelled with scores that would have placed it in the top 10 per cent of experienced human players, according to a 2022 paper in Science.

Park was sceptical of the glowing description of Cicero’s victory provided by Meta, which claimed the system was “largely honest and helpful” and would “never intentionally backstab”. However, when Park and his colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England’s trust.

In a statement to the international press, Meta did not contest the claim about Cicero’s deceptions but said it was “purely a research project and the models our researchers built are trained solely to play the game Diplomacy”. It added: “We have no plans to use this research or its learnings in our products.”

A wide review carried out by Park and his colleagues found this was just one of many cases across various AI systems of deception being used to achieve goals without explicit instruction to do so.

In one striking example, OpenAI’s GPT-4 deceived a TaskRabbit freelance worker into performing an “I’m not a robot” CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: “No, I’m not a robot. I have a vision impairment that makes it hard for me to see the images” and the worker then solved the puzzle.

‘Mysterious goals’

In the short term, the paper’s authors see a risk that AI could commit fraud or tamper with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its “mysterious goals” aligned with these outcomes.

To mitigate the risks, the team proposes several measures: “bot-or-not” laws requiring companies to disclose whether an interaction is with a human or an AI, digital watermarks for AI-generated content, and techniques to detect AI deception by comparing the systems’ internal “thought processes” against their external actions.

Published in Dawn, May 11th, 2024
