for pharmaceuticals these are {animal models + human trials, efficacy and safety test
criteria, and well-documented statistical evidence}.
19) A simple way to think about this issue: when agents fail in deployment, it is often because
people treat them like employees, giving them poorly scoped tasks and lots of freedom. AI
agents should instead be treated as regulated contractors: (1) given very explicitly specified
tasks, (2) allowed to deliver on those, and (3) required to show that their actions
comply with the specification given.
o Alignment with societal values is not enough to ensure good outcomes. If it were,
every disagreement between an engineering manager and a product manager about a design
decision would have to be explained as a result of misaligned societal values, which is
rarely the case. (Some decisions rest on normative beliefs that require crossing the
is/ought chasm.)
o In software this would look like verifying code against a formal specification; the same
formalization process has also been applied to automate review of compliance with the tax
code (catala-lang.org), building codes (symbium.com), financial regulation (imandra.ai),
and need-to-know access (knostic.ai/). A minimal sketch of such specification checking for
agent actions follows below.
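To make the "regulated contractor" framing of point 19 more concrete, the following is a minimal Python sketch of specification-first agent deployment: an explicitly scoped task specification, a delivered set of agent actions, and a compliance check over those actions before delivery is accepted. All class names, fields, and limits (TaskSpecification, AgentAction, max_spend_usd, and so on) are hypothetical illustrations, not an existing standard or library API.

```python
"""Sketch of the 'regulated contractor' pattern for AI agents (point 19):
(1) an explicitly scoped task specification, (2) a delivered set of agent
actions, and (3) a compliance check of those actions against the specification.
Names and fields are illustrative assumptions only."""
from dataclasses import dataclass, field


@dataclass
class TaskSpecification:
    """Explicit scope handed to the agent up front (1)."""
    allowed_actions: set[str]                 # e.g. {"read_invoice", "draft_email"}
    max_spend_usd: float = 0.0                # hard limit the agent may not exceed
    required_evidence: set[str] = field(default_factory=set)


@dataclass
class AgentAction:
    """One step the agent claims to have taken while delivering (2)."""
    name: str
    spend_usd: float = 0.0
    evidence: set[str] = field(default_factory=set)


def verify_compliance(spec: TaskSpecification, actions: list[AgentAction]) -> list[str]:
    """Check every delivered action against the specification (3); return violations."""
    violations: list[str] = []
    total_spend = 0.0
    evidence: set[str] = set()
    for act in actions:
        if act.name not in spec.allowed_actions:
            violations.append(f"action '{act.name}' is outside the specified scope")
        total_spend += act.spend_usd
        evidence |= act.evidence
    if total_spend > spec.max_spend_usd:
        violations.append(f"total spend {total_spend:.2f} exceeds limit {spec.max_spend_usd:.2f}")
    missing = spec.required_evidence - evidence
    if missing:
        violations.append(f"missing required evidence: {sorted(missing)}")
    return violations


if __name__ == "__main__":
    spec = TaskSpecification(
        allowed_actions={"read_invoice", "draft_email"},
        required_evidence={"invoice_id"},
    )
    delivered = [
        AgentAction("read_invoice", evidence={"invoice_id"}),
        AgentAction("send_payment", spend_usd=120.0),  # out of scope and over budget
    ]
    for line in verify_compliance(spec, delivered) or ["compliant"]:
        print(line)
```

The key property is that delivery is only accepted when the compliance check returns no violations, mirroring requirement (3) above.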
20) There are many different types of agents, and their specific value lies in their autonomy,
which implies a reduction in human oversight (HITL principle). This naturally raises the
risk of misalignment.
21) To mitigate such risk, adaptive trust mechanisms are needed, embedded in human-AI
interaction frameworks that employ dynamic human intervention thresholds to adjust the
level of human oversight according to risk, confidence, and context. That is, low-risk agent
decisions are automated, while high-risk decisions continue to require human validation,
with a clear audit trail (see the sketch below). In this respect, the approach is essentially
no different from traditional AI governance concepts. Importantly, under current legal
frameworks, the accountability for AI agents still rests with humans.
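As a rough illustration of the dynamic intervention thresholds described in point 21, the Python sketch below routes each agent decision either to automatic execution or to human validation based on a context-dependent risk threshold and the agent's reported confidence, and records every decision in an audit trail. The threshold values, context names, and field names are assumptions made for the example, not values drawn from any standard.

```python
"""Sketch of threshold-based routing for agent decisions (point 21):
low-risk, high-confidence decisions are auto-approved, the rest are escalated
to a human, and every decision is logged. All values are illustrative."""
import json
import time
from dataclasses import dataclass


@dataclass
class AgentDecision:
    description: str
    risk_score: float   # 0.0 (harmless) .. 1.0 (critical), assumed estimated upstream
    confidence: float   # agent's self-reported confidence in [0, 1]


def intervention_threshold(context: str) -> float:
    """Dynamic threshold: stricter oversight in sensitive contexts (assumed values)."""
    return {"finance": 0.2, "healthcare": 0.1}.get(context, 0.5)


def route_decision(decision: AgentDecision, context: str, audit_log: list[dict]) -> str:
    """Automate low-risk decisions; require human validation otherwise."""
    threshold = intervention_threshold(context)
    needs_human = decision.risk_score > threshold or decision.confidence < 0.8
    outcome = "pending_human_validation" if needs_human else "auto_approved"

    # Clear audit trail: every decision is recorded, however it is approved.
    audit_log.append({
        "timestamp": time.time(),
        "context": context,
        "decision": decision.description,
        "risk_score": decision.risk_score,
        "confidence": decision.confidence,
        "threshold": threshold,
        "outcome": outcome,
    })
    return outcome


if __name__ == "__main__":
    log: list[dict] = []
    print(route_decision(AgentDecision("re-order office supplies", 0.05, 0.95), "procurement", log))
    print(route_decision(AgentDecision("approve wire transfer", 0.60, 0.90), "finance", log))
    print(json.dumps(log, indent=2))
```

Human accountability is preserved because escalated decisions wait for explicit validation, while the audit trail keeps even auto-approved decisions reviewable after the fact.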
Figure 41: Multi-agent security standards 10
22) At the same time, even if there is an appropriate/dynamic level of human oversight,
there are threat models targeting human cognitive limitations and compromising
interaction frameworks. One example would be when attackers are attempting to exploit
10 Source: Ant Group presentation: https://s41721.pcdn.co/wp-content/uploads/2021/10/Session-1_Xiaofang_Final-version.pdf