A.3.10 DEL 7.2: Artificial intelligence technical test specification
Summary: This document specifies how an AI system can and should be tested in silico. It reviews, among other aspects, best practices for test procedures known from (but not limited to) AI challenges. Important testing paradigms that are not exclusive to AI applications are also covered.
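As a minimal sketch of what such an in-silico test might look like in practice, the example below evaluates a fixed model on a held-out test set and reports a discrimination metric with a bootstrap confidence interval, a common practice in AI challenges. The synthetic data, the choice of AUROC, and the 95% bootstrap interval are illustrative assumptions for this sketch, not requirements of DEL 7.2.

```python
# Illustrative in-silico test: score a trained model on a held-out test set
# and report AUROC with a bootstrap 95% confidence interval. The data, metric,
# and interval level are assumptions for this sketch, not DEL 7.2 rules.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a labelled clinical dataset.
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)

# Hold out a test set that the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
model = LogisticRegression().fit(X_train, y_train)
scores = model.predict_proba(X_test)[:, 1]

# Point estimate plus a bootstrap confidence interval over test cases.
auroc = roc_auc_score(y_test, scores)
boot = []
for _ in range(1000):
    idx = rng.integers(0, len(y_test), len(y_test))
    if len(np.unique(y_test[idx])) < 2:  # skip single-class resamples
        continue
    boot.append(roc_auc_score(y_test[idx], scores[idx]))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"AUROC = {auroc:.3f} (95% CI {lo:.3f}-{hi:.3f})")
```

Reporting an uncertainty interval rather than a single point estimate is one of the practices such test specifications typically emphasize, since challenge leaderboards can otherwise reward differences that are within sampling noise.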
A.4 Clinical evaluation and use cases
A.4.1 DEL 7.4: Clinical evaluation of AI for health
Summary: Artificial intelligence (AI) in healthcare holds great promise to improve people's health worldwide by transforming the screening, diagnosis, therapy and monitoring of diseases. The increasing amount and availability of digitized health data have facilitated the use of AI, which can analyse large datasets, provide new insights, and identify patterns in seen and unseen data. There are already many potential applications for AI in medicine, and considering factors such as the global shortage of healthcare professionals, changing population demographics worldwide, and the ongoing global digital transformation, there is huge interest in the potential of AI systems in both high- and low-resourced settings. Achieving this potential beneficial impact requires frameworks for evaluating AI systems, to ensure that they are safe, effective, and useful; that they do not cause unanticipated harm when applied to a complex clinical pathway or used autonomously; and that costs and ethics are adequately considered.
The adoption of effective, safe, ethical, inclusive, and fair AI systems into health systems is a global concern that requires input from a wide range of stakeholders. Clinical evaluation of AI systems, covering their underpinning data, performance, and safety, together with transparent communication of the results, is critical for delivery.
Working from the principles of evidence-based medicine, but acknowledging the particular challenges and opportunities of AI-based technologies, this report provides a framework for the evaluation of AI systems in health that can be used by clinicians, researchers, developers, regulators, health systems, and policymakers to understand whether a particular AI system is likely to be effective and safe in their setting. It was developed by members of the FG-AI4H [4] Working Group on Clinical Evaluation and is part of a series of guideline documents (deliverables) produced by FG-AI4H. In keeping with the WHO's stated goal to 'leave no one behind', the group gave special consideration to low-resourced settings when creating the framework and recommendations, which draw on current best practices and also identify potential gaps for future research.
The framework divides clinical evaluation into four phases: evaluation of model purpose and suitability, algorithmic validation, clinical validation, and ongoing monitoring. It also draws attention to the essential requirements of ethical and economic evaluation, which cut across all four phases.
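As a minimal sketch of how an evaluator might track progress through these phases, the structure below records evidence against each phase and the cross-cutting concerns. The field names, the `EvaluationRecord` class, and the evidence strings are our own illustration and are not defined by DEL 7.4.

```python
# Illustrative checklist mirroring the four evaluation phases described above,
# plus the cross-cutting ethical and economic considerations. The structure
# and names are assumptions for this sketch, not part of the deliverable.
from dataclasses import dataclass, field

PHASES = [
    "model purpose and suitability",
    "algorithmic validation",
    "clinical validation",
    "ongoing monitoring",
]
CROSS_CUTTING = ["ethical evaluation", "economic evaluation"]

@dataclass
class EvaluationRecord:
    system_name: str
    completed: dict = field(default_factory=dict)  # item -> evidence

    def record(self, item: str, evidence: str) -> None:
        assert item in PHASES + CROSS_CUTTING, f"unknown item: {item}"
        self.completed[item] = evidence

    def outstanding(self) -> list:
        return [p for p in PHASES + CROSS_CUTTING if p not in self.completed]

record = EvaluationRecord("example triage model")
record.record("model purpose and suitability", "intended-use statement v1")
print(record.outstanding())  # items still lacking documented evidence
```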
Evaluation of model purpose and suitability requires:
– an understanding of the problem and the intended use of the AI system
– a definition of the intended benefits
– a description of the potential risks and harms