

Workshop on Artificial Intelligence for Health Standardized Assessment Framework - Handling and Assessment Methods

Fraunhofer HHI
Berlin, Germany, 8-9 Jan 2020

Workshop organizers
Fraunhofer Institute for Telecommunications
Einsteinufer 37
10587 Berlin

Purpose of the Workshop

The mandate of the ITU/WHO FG-AI4H is to establish a standardized assessment framework for the evaluation of AI-based methods for health, diagnosis, triage or treatment decisions. The purpose of this first workshop in the AI4H statistics and benchmarking series was to take the next technical steps in (a) setting up a sandbox for the assessment framework and (b) defining testing procedures for quality assessment of data sets and AI models. Furthermore, the workshop was an opportunity to learn more about FG-AI4H as well as the people and institutions that are involved in its day-to-day work.


This workshop was targeted at professionals specializing in or working at the intersection of the health, ICT and AI domains. This included individuals from academia (postdocs, PhD students, professors) and industry (developers, consultants, doctors, healthcare workers, freelancers). Please note that this was not an exhaustive list: people who did not fit into these groups but had a strong interest in contributing were invited to reach out to the organizers. Participation and contribution were possible in different capacities.


Some activities started at the workshop continue in virtual meetings and successive physical meetings. This is the normal process for our Focus Group: most of the work is done through virtual meetings to be economical with emissions and time budgets. Physical meetings are organized regularly to facilitate interactions that are hard to replicate online and to provide a point of entry for new members.

The following activities and topics were dealt with at the workshop:

1.   Set up the sandbox for the assessment platform based on an existing software library
Comment: This meant starting to set up the existing software for a benchmarking platform on a server.
2.   Collect technical expertise and future requirements for the platform with respect to

(a) Data processing management
(b) Encryption
(c) Data set splitting

3.   Collect technical expertise and requirements for methods and tests with respect to

(a)  Assessment of test data quality, in particular

i.    Completeness
ii.   Heterogeneity and varying quality grades of measurements
iii.  Robustness as a property of data
iv.  Characteristics of data sets, summary statistics, “data sheets for data sets”
v.   Bias
vi.  Fairness
vii. Integrity
viii. Reproducibility of data collection procedure

(b) Assessment of AI model quality, in particular safety and reliability dimensions such as

i.    Robustness
ii.   Generalizability
iii.  Uncertainty quantification
iv.  Explainability
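
To make two of the topics above more concrete, the following is a minimal sketch of a completeness check (item 3(a)i) and a reproducible data set split (item 2(c)). It is not taken from the FG-AI4H platform or its software library. The function names `completeness` and `assign_split`, the record layout, and the 20% test fraction are all assumptions chosen for illustration.

```python
import hashlib


def completeness(records, fields):
    """Return, per field, the fraction of records with a non-missing value.

    Assumption: a record is a dict, and a value counts as missing when the
    key is absent or the value is None.
    """
    total = len(records)
    return {
        field: (
            sum(1 for r in records if r.get(field) is not None) / total
            if total
            else 0.0
        )
        for field in fields
    }


def assign_split(record_id, test_fraction=0.2):
    """Assign a record to 'test' or 'train' from a hash of its identifier.

    Hashing the identifier (rather than drawing random numbers) keeps the
    assignment stable across runs and machines, so the same record always
    lands in the same partition.
    """
    digest = hashlib.sha256(str(record_id).encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "test" if bucket < test_fraction else "train"


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    sample = [
        {"id": "p1", "age": 30, "blood_pressure": None},
        {"id": "p2", "age": None, "blood_pressure": 120},
    ]
    print(completeness(sample, ["age", "blood_pressure"]))
    print({r["id"]: assign_split(r["id"]) for r in sample})
```

A hash-based assignment is only one possible design. It keeps records from moving between partitions when new data arrive, but it does not stratify by label or patient group, which a real assessment platform would likely need to address.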

Participants were asked to indicate in the online form the activities in which they were most interested and whether they would be able to give a short presentation on the aspects that, from their perspective, are important to consider with respect to the quality assessment of AI for health applications and data sets.

Structure of the Workshop

The final program of the workshop is available here.

Background on ITU/WHO FG-AI4H

Health technologies require careful evaluation, taking both technical and health-related aspects into account, prior to wider usage. Several factors complicate this evaluation and, thus, the deployment of artificial intelligence (AI) solutions in the health context. Therefore, the International Telecommunication Union (ITU) and the World Health Organization (WHO) have initiated an international standard-seeking effort to address these challenges by creating a joint Focus Group on Artificial Intelligence for Health (FG-AI4H). The ITU and the WHO are two specialized agencies of the United Nations authorized to create global standards in the fields of information technology and health, respectively. The mandate of the ITU/WHO focus group is to undertake crucial, exploratory steps towards evaluation standards that are applicable on a global scale. FG-AI4H has begun working towards establishing a rigorous evaluation process for AI4H solutions, under the supervision of ITU and WHO, with a global community of experts in health, machine learning and AI from both academia and industry, as well as experts in regulation.

Detailed information on the work FG-AI4H is doing can be found on the website, in a commentary in The Lancet and in the following white paper. Furthermore, all documentation can be accessed via the online collaboration system (a free ITU account is needed; see instructions for help).