AI Standards for Global Impact: From Governance to Action
and context-aware behaviour (contextual performance), taking into account societal factors and
the specific communities affected.
Establishing standards can help bridge the gap between technical metrics and societal values.
Currently, there is no clear international consensus on the definitions and distinctions among
AI agents, agentic AI, physical AI, and embodied AI, particularly from the perspective of
standardization.
AI testing and verification are evolving beyond assessment of model capabilities alone; broader, more dynamic scopes are now needed from both technical and socio-technical perspectives. Sharing best practices in AI testing is essential at the international level, but experience-based practices alone cannot address unknown risks, and workshop participants therefore called for sharing logic-based frameworks across the AI ecosystem. The context of AI risk assessment has shifted significantly, raising two key questions going forward:
• What specific technical and socio-technical AI safety risks must be addressed?
• How should these risks be addressed effectively?
The scope of AI risk management is becoming increasingly complex due to new technologies
such as:
• Agentic AI
• Physical and embodied AI
• Multimodal foundation models
Adopting a "risk-chain model" appears essential for understanding and addressing the complex interactions between AI systems. International collaboration on AI testing is needed to share best practices and to identify areas where standards are required.
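One way to read a risk-chain model is as a dependency structure in which one realized risk can trigger downstream risks. The sketch below is purely illustrative: the risk names, the chain structure, and the traversal function are assumptions for demonstration, not part of the report or of any standard.

```python
from dataclasses import dataclass, field

@dataclass
class Risk:
    """One link in a hypothetical risk chain."""
    name: str
    triggers: list = field(default_factory=list)  # names of downstream risks

def chain_from(start: str, risks: dict) -> list:
    """Trace every downstream risk reachable from a starting risk."""
    seen, stack, order = set(), [start], []
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        order.append(name)
        stack.extend(risks[name].triggers)
    return order

# Illustrative chain for an agentic AI system (names are invented).
risks = {
    "prompt_injection": Risk("prompt_injection", ["tool_misuse"]),
    "tool_misuse": Risk("tool_misuse", ["data_leak", "physical_harm"]),
    "data_leak": Risk("data_leak"),
    "physical_harm": Risk("physical_harm"),
}

print(chain_from("prompt_injection", risks))
```

Tracing the full chain rather than scoring each risk in isolation is what distinguishes this style of analysis: a seemingly minor upstream failure is assessed by everything it can cascade into.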
User trust in AI systems is vital for their acceptance. Trust comprises elements such as interpretability, fairness, and reliability, and users' perceptions shape their trust levels. Evaluating trust is difficult because these subjective elements must be quantified consistently across different users and contexts. Current AI testing methods also face limitations, including evaluation frameworks that translate poorly to real-world applications: many systems excel in the laboratory but struggle in actual deployments. The lack of standardized processes hinders comparison across organizations, undermining AI credibility. The challenges include:
a) Lack of unified standards
b) Lack of comprehensive testing frameworks
c) Transparency and explainability
d) Regulatory differences
e) Data privacy concerns
f) Rapid technological advancements
g) Resource limitations
h) Bias and fairness
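To make the trust-quantification challenge above concrete, one naive approach is a weighted aggregate over per-dimension ratings. The dimensions are those named in the text; the weights, rating scale, and function are illustrative assumptions, not a proposed standard, and the difficulty the report points to is precisely that such weights and ratings are hard to make consistent across users and contexts.

```python
# Illustrative only: the weights below are assumptions, not from any standard.
TRUST_WEIGHTS = {"interpretability": 0.3, "fairness": 0.3, "reliability": 0.4}

def trust_score(ratings: dict) -> float:
    """Aggregate per-dimension ratings in [0, 1] into one weighted score."""
    return sum(TRUST_WEIGHTS[dim] * ratings[dim] for dim in TRUST_WEIGHTS)

print(round(trust_score(
    {"interpretability": 0.8, "fairness": 0.9, "reliability": 0.7}), 2))
```

Two different organizations using different weights or elicitation methods would produce incomparable scores from the same system, which is why the report argues for standardized evaluation processes.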
International collaboration is vital for creating global AI testing standards, fostering knowledge
sharing and resource integration. Collaborative frameworks can help countries develop best
practices and boost AI credibility and reliability while promoting innovation and sustainable
development.