
AI Standards for Global Impact: From Governance to Action



                  Part 2: Thematic AI Standards Workshops







                      6   Trustworthy AI testing and validation

                  The main objectives of the Trustworthy AI testing and validation workshop were:

                  a)   Discuss research on AI system testing and verification methods
                  b)   Provide an overview of the different methodologies used to test and verify AI
                       systems, together with their strengths and limitations
                  c)   Identify gaps in current methodologies for AI system testing and verification
                  d)   Explore examples of these methodologies and their applications in AI system
                       testing, such as agentic AI testing and LLM security testing
                  e)   Discuss opportunities for collaboration on AI testing and verification through an
                       international collaborative platform

                  Collaboration will be key in developing a shared understanding of what constitutes
                  trustworthy AI and in sharing lessons learnt on best practices and appropriate technical
                  tools and standards for AI validation and verification. The workshop's main aims were
                  to provide information about research trends in AI system testing and verification,
                  covering key methods, their strengths and limitations, and opportunities for
                  international collaboration on AI testing.


                  6.1  AI system testing

                  The first session discussed the challenges of AI testing and current research underway
                  in the field of trustworthy AI testing.

                  Princeton University shared their work on testing autonomous driving. Trust can be placed in
                  AI just as it is in humans. AI becomes trustworthy when models deliver consistent, error-free
                  responses across different environments and make reliable decisions. When users see an AI
                  system behaving predictably and dependably, they begin to trust it – just as they would a
                  reliable person. The first and most critical step towards building trust is ensuring that an AI
                  performs reliably even when faced with unfamiliar data. It is important that AI not only functions
                  in controlled lab settings but also delivers consistent results when applied to real-world data. All
                  too often, we see AI models failing to meet expectations when exposed to real-life conditions,
                  and that undermines trust.
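
                  A minimal sketch of such a robustness check, in Python, is shown below; the model
                  object and its predict method, the dataset splits, and the 0.05 tolerance are all
                  illustrative assumptions rather than a standardized procedure.

import numpy as np

def accuracy(model, inputs, labels):
    """Fraction of inputs for which the model's prediction matches the label."""
    predictions = model.predict(inputs)
    return float(np.mean(predictions == labels))

def check_real_world_robustness(model, lab_data, real_world_data, tolerance=0.05):
    """Compare accuracy on curated lab data with accuracy on real-world data.

    lab_data and real_world_data are (inputs, labels) pairs; tolerance is an
    illustrative threshold, not a standardized value.
    """
    lab_acc = accuracy(model, *lab_data)
    real_acc = accuracy(model, *real_world_data)
    print(f"lab accuracy: {lab_acc:.3f}, real-world accuracy: {real_acc:.3f}")
    # A large drop from lab to real-world performance is the kind of
    # distribution-shift failure that undermines trust.
    return (lab_acc - real_acc) <= tolerance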

                  To assess the trustworthiness of an AI system, it needs to be examined from the perspectives
                  of different stakeholders and its context or socio-technical ecosystem. This socio-technical
                  "systems view" can help to understand the expected behaviour of the AI system for various
                  input scenarios.
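
                  As an illustration of how this systems view can be operationalized, the sketch below
                  checks a system's decisions against stakeholder-defined expected behaviours for a set
                  of input scenarios; the decide method, the scenario names, and the expected actions
                  are hypothetical placeholders.

# Hypothetical mapping from input scenarios to the behaviour that
# stakeholders expect; in practice this would be elicited from domain
# experts, regulators, and other parties in the socio-technical ecosystem.
EXPECTED_BEHAVIOUR = {
    "pedestrian_crossing": "brake",
    "clear_highway": "maintain_speed",
    "vehicle_merging": "yield",
}

def run_scenario_suite(system):
    """Run the system on each scenario and record whether its decision
    matches the behaviour expected in that context."""
    results = {}
    for scenario, expected in EXPECTED_BEHAVIOUR.items():
        actual = system.decide(scenario)  # hypothetical interface
        results[scenario] = (actual == expected)
    return results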

                  For example, in the case of autonomous driving, how should adequate test metrics be
                  defined for AI system testing, and what are the various contexts to be taken into
                  consideration for AI safety? This is a difficult and multi-faceted question, requiring
                  conscious intervention at every



