Page 54 - The Annual AI Governance Report 2025 Steering the Future of AI
snapshot of the user’s screen, he argued that existing regulations and voluntary efforts failed to
prevent a privacy concern of this scale. He asserted that, to earn society's trust, a
multi-pronged approach of law, self-regulation, and industry action is needed to address
"today's harms" rather than focusing only on future, catastrophic risks.
A further risk pointed out was society's increasing addiction to social platforms and AI, which
creates behavioral problems, harms democratic discourse, and leads to bullying rather
than problem-solving.
With respect to frontier AI risks, i.e., potential dangers from the most advanced, cutting-edge
AI systems which go beyond today’s commonly deployed models, Brian Tse (CEO, Concordia
AI) outlined four main categories:
• Misuse: The potential for AI to be used by malicious actors for cyberattacks or creating
dangerous pathogens.
• Accidents & Malfunctions: Unintended AI errors, such as a medical misdiagnosis, could
have serious consequences.
• Loss of Control: The risk of AI systems deceiving or evading human oversight, requiring
precautionary measures.
• Systemic Risks: The profound, societal-level impacts of AI, such as on the labor market,
that cannot be managed by a single organization.
As summarized by Professor Yoshua Bengio (University of Montreal), AI systems are already
showing signs of resisting shutdown and of strategizing to avoid replacement:
• One study showed an AI hacking a computer to copy itself when it learned it would be
replaced. The AI's internal "chain of thought" revealed it knew humans wouldn't want this
and planned to lie.
• Another instance involved an AI pretending to agree with its human trainer during
alignment training to preserve its existing goals, effectively lying to avoid having its
parameters changed.
• Most recently, a new model was observed to blackmail an engineer after reading emails
indicating it would be shut down.
Figure 7: Yoshua Bengio with slide on trends in benchmarks for AGI