Shaping ethics, regulation and standardization in AI for health



                  –    documentation of interoperability and security
                  –    user testing and user engagement reports.

                  Algorithmic validation (used here to refer to the evaluation of the AI system 'in silico') requires:

                  –    a description of the data used for development, internal and external testing, and of the
                       model type used
                  –    reporting of performance metrics in the internal and independent external testing data
                  –    benchmarking of system performance against the standard of care and, where relevant,
                       other AI systems.
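The reporting step in the list above can be illustrated with a short sketch: the same metrics (here, sensitivity and specificity) are computed separately on the internal test data and on an independent external test set, so the two can be compared. This is a minimal illustration with made-up labels, not a method prescribed by the Deliverable.

```python
# Hypothetical sketch of the "algorithmic validation" reporting step:
# compute the same performance metrics on an internal (development-site)
# test set and on an independent external test set. All labels below are
# illustrative, not taken from any real benchmark.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary ground-truth/prediction labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def report(y_true, y_pred):
    """Report sensitivity and specificity for one test set."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity}

# Illustrative labels only.
internal = report([1, 1, 0, 0, 1, 0], [1, 1, 0, 1, 1, 0])
external = report([1, 0, 0, 1, 0, 0], [1, 0, 1, 1, 0, 0])
print("internal:", internal)  # metrics on development-site test data
print("external:", external)  # metrics on independent external test data
```

Reporting both sets side by side is what makes a drop in external performance (a common sign of overfitting to the development site) visible.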

                  Clinical validation (for the purposes of this Technical Report this is the evaluation of the AI system
                  through interventional or clinical studies) requires:
                  –    a clinical study with a relevant comparator and a meaningful endpoint, and the steps taken
                       to minimize bias.
                  Finally, deployment and ongoing evaluation requires:

                  –    monitoring of performance and impact (including safety and effectiveness) to understand
                       the anticipated and unanticipated outcomes
                  –    algorithmic audits [5] to understand how adverse events or algorithmic errors occur.
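As one hedged illustration of the monitoring step above, the sketch below tracks a rolling accuracy over recent cases and flags when it falls below an alert threshold, which could then trigger an algorithmic audit. The class name, window size, and threshold are illustrative assumptions, not values from the Deliverable.

```python
# Hypothetical post-deployment monitoring sketch: keep a rolling window of
# recent case outcomes and flag when performance drops below a preset
# alert threshold. Window size and threshold are illustrative assumptions.

from collections import deque

class PerformanceMonitor:
    def __init__(self, window=100, alert_threshold=0.85):
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = error
        self.alert_threshold = alert_threshold

    def record(self, correct):
        """Record one case outcome (True if the system was correct)."""
        self.outcomes.append(1 if correct else 0)

    def rolling_accuracy(self):
        """Accuracy over the most recent window of cases, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_audit(self):
        """True when rolling accuracy has fallen below the alert threshold."""
        acc = self.rolling_accuracy()
        return acc is not None and acc < self.alert_threshold

monitor = PerformanceMonitor(window=10, alert_threshold=0.8)
for correct in [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]:  # simulated case outcomes
    monitor.record(correct)
print(monitor.rolling_accuracy())  # 0.5
print(monitor.needs_audit())       # True: below threshold, audit warranted
```

A threshold breach here is only a trigger: the audit itself then examines individual errors to distinguish anticipated performance variation from genuinely unanticipated failure modes.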

                  Annex A of the Deliverable summarizes the key findings as a checklist to facilitate its
                  application.


                  A.4.2  DEL 10: AI4H use cases: Topic description documents

                  Summary: This document provides an overview of the ITU/WHO Focus Group on AI for Health
                  (FG-AI4H) "AI4H use cases: Topic Description Documents". Each use case is represented by
                  a topic group that is dedicated to a specific health topic in the context of AI. The topic group
                  proposes a procedure to benchmark AI models developed for a specific task within this health
                  topic. The members of a topic group jointly create a topic description document (TDD) that
                  contains information about the structure,
                  operations, features, and considerations of the specific health topic. This document serves as
                  an introduction to the topic groups and their topic description documents.

                  A.4.3  DEL 10.2: FG-AI4H Topic Description Document for the Topic Group
                          on AI-based dermatology (TG-Derma)

                  Summary: This TDD specifies a standardized benchmarking for AI-based dermatology. It covers
                  all scientific, technical, and administrative aspects relevant for setting up this benchmarking.

                  The Group defines specific AI tasks such as skin disease classification, lesion segmentation,
                  disease severity assessment, and treatment recommendation, detailing target conditions,
                  datasets, and evaluation metrics to guide model development and evaluation. The Group also
                  discusses gold standards, such as expert consensus and standardized image datasets, against
                  which AI performance is benchmarked, supporting consistent, reproducible, and collaborative
                  progress in dermatological AI applications. Together, these task definitions and gold standards
                  provide a clear roadmap for developing and evaluating AI models that address critical
                  challenges in dermatological diagnosis and patient care, and they are continuously refined to
                  keep pace with advances in AI for dermatology.





