
AI Standards for Global Impact: From Governance to Action



                   The main objective of international collaboration on AI testing and assurance would be to
                   establish unified standards for the reliable, safe, and ethical use of AI technologies. This includes
                   developing consistent testing methodologies and fostering cooperation among governments,
                   industry, and academia to share best practices. By addressing regulatory differences and
                   building capacity, the goal is to create a transparent AI ecosystem that enhances public trust
                   and mitigates risks.

                   An example of how countries’ different risk thresholds can affect joint AI testing was shared
                   in relation to a joint exercise held by Japan and the UK on making LLMs reliable across
                   linguistic environments, assessing whether guardrails hold up in non-English settings. As AI
                   agents are deployed globally, it is also important that these agents handle different languages
                   accurately and treat different cultures appropriately and securely. Another challenge is the
                   reproducibility of AI tests, which is an important objective for joint AI testing.

                   Common definitions for terms are needed for consistency. It was noted that ISO, the US National
                   Institute of Standards and Technology (NIST), ITU, IEEE, the OECD, and many others already
                   provide rich vocabularies for risk, control, and evidence. The challenge is that these vocabularies
                   only partially overlap: translating among them is slow and error-prone, making test results hard
                   to compare across borders or sectors. To address this, one near-term goal could be to map
                   terminology from existing standards into a single, open glossary and invite standards bodies
                   to validate and refine the mapping.

                   Dozens of benchmarks exist, but most focus on accuracy or speed, not on edge-case safety,
                   long-horizon planning, or social influence, for example. Multiple labs have begun running
                   deeper safety evaluations, yet their protocols are rarely interoperable.
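As a minimal sketch of what such an open, machine-readable glossary crosswalk might look like, the snippet below maps canonical terms to the labels different standards use for them. All vocabularies and term pairings shown are hypothetical illustrations, not actual ISO, NIST, or ITU definitions.

```python
# Illustrative sketch only: a minimal cross-standard term crosswalk.
# The term pairings below are hypothetical examples, not real
# definitions drawn from any published standard.

GLOSSARY = {
    "model-risk": {
        "ISO/IEC 23894": "AI risk",       # hypothetical mapping
        "NIST AI RMF": "risk",            # hypothetical mapping
    },
    "test-evidence": {
        "ISO/IEC 29119": "test evidence", # hypothetical mapping
        "NIST AI RMF": "documentation",   # hypothetical mapping
    },
}

def lookup(canonical_term: str, standard: str):
    """Return the label a given standard uses for a canonical term, if any."""
    return GLOSSARY.get(canonical_term, {}).get(standard)

def sources(canonical_term: str):
    """List the standards that provide a label for a canonical term."""
    return sorted(GLOSSARY.get(canonical_term, {}))
```

A published crosswalk of this shape would let test reports cite one canonical term while remaining traceable to each body's own vocabulary; standards bodies could then validate or correct individual mappings rather than negotiate a whole new vocabulary.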

                   Models now update weekly, meaning certificates issued annually quickly go stale. Pioneering
                   groups are experimenting with rolling audits and red-teaming pipelines, but the data seldom
                   flows beyond the organization that generated it.

                   A new Open Alignment Assurance Initiative led by the International Association for Safe and
                   Ethical AI aims to connect and extend these efforts rather than start from scratch. It would link
                   existing academic, corporate, and national labs under a common protocol and add capacity
                   where it is missing, so that a test run in Nairobi carries the same weight as one run in New York,
                   for example.

                   Some of the key takeaways on the importance of international collaboration for AI testing are:

                   1)   Standardized quality metrics and consistent definitions are required for testing AI systems
                   2)   Standards on AI testing can help to support policy and legislation on AI governance
                   3)   Sharing best practices on AI testing methods, tools, and capacity building on AI testing
                        methodologies is essential
                   4)   Reproducibility of AI tests is an important objective for joint AI testing
                   5)   Standards gaps need to be identified and pre-standardization work on AI testing initiated
                   6)   ITU could play a leading role in facilitating multistakeholder international collaboration on
                        trustworthy AI testing. The collaboration could focus on three key areas:

                        i.  Capacity building
                        ii.  Promoting standards and best practices
                        iii.  Institutional frameworks




