Page 91 - ITU KALEIDOSCOPE, ATLANTA 2019

ICT for Health: Networks, standards and innovation




health application. The deliverables of the WGs are planned to be a number of documents that cover topics including:

   •  AI ethical considerations,
   •  AI legal considerations,
   •  AI software life cycle,
   •  reference data annotation specification,
   •  training and test data specification,
   •  AI training process specification,
   •  AI test process specification,
   •  AI test metric specification, and
   •  AI post-market adaptation and surveillance specification.

An overview of the technical output of the WGs is given in Figure 2.

Figure 2 − Overview of the technical output of the WGs

The WG on data and AI solution assessment methods reviews the topic description documents (see below) in collaboration with independent experts who have substantial records of accomplishment in the respective health topic, proficient knowledge in ML/AI, and transversal competences from areas such as ethics and statistics. During a repeated review cycle, the working group and the experts check that the topic description documents are accurate, complete, sound, understandable and objective, and give corresponding feedback for improvement to the respective topic group and the entire focus group. The WG is in charge of providing a number of technical deliverables, given above.

The working group on data and AI solution handling takes charge of a range of tasks related to conducting the tests, which requires bringing the test data and the to-be-tested AI solutions together. Relevant aspects include, e.g., transfer agreements, secure data and solution transfer, data checks, IT infrastructure, access rights, traceability, IT security, test implementation and report generation.

The working group for regulatory considerations is involved in the entire process, with representatives of the FDA (USA), CMDE/NMPA (China), CDSCO (India), EMA (Europe) and BfArM (Germany) so far. In close collaboration with the WHO, the working group facilitates subsequent steps (e.g. AI testing process specification, clinical evaluation, certification, etc.) towards deployment of the health AI solution in practice.

The topic groups, TGs, take charge of specific health domains with corresponding ML/AI tasks. They connect the WGs with actual health topics and the specific problems involved in a number of AI for health tasks and data modalities. At present, the topic groups address AI-based cardiovascular disease risk prediction, dermatology, histopathology, outbreak detection, ophthalmology, radiotherapy, symptom assessment, tuberculosis prognostics/diagnostics and several further domains. In each topic group, different stakeholders with a common interest in the topic, including competing companies, work together. “Calls for topic group participation” published on the website (https://www.itu.int/go/fgai4h) introduce the respective topic groups and invite participation. The creation of many other topic groups in response to the open “call for proposals: use cases, benchmarking, and data” is expected. Selection criteria include the prospect of a widespread and, ideally, global impact, a clear concept described in sufficient detail, and preliminary evidence of feasibility.

Every topic group defines its scope, the specific ML/AI tasks, and the evaluation procedures with corresponding test data and metrics in full detail in a topic description document. Statistical metrics for assessing model performance include, e.g., precision, specificity, F1 score and area under the curve; multiple or combined metrics can also be used [61]. In particular, it should be ensured that the (e.g. clinical) endpoints are meaningful in practice. Further criteria should be considered, e.g. robustness to noise and other variations in the input data [62], or to manipulations [65]. Humans prefer transparent decision-making: can the model adequately quantify the uncertainty [63] and plausibly explain the decision [66, 67]? These criteria beyond mere performance should also be considered.

The topic description document must capture a range of aspects related to the test data, because they largely determine whether the evaluation procedure is appropriate and meaningful. The procedure can return conclusive results if, and only if, the test data are realistic, i.e. close to the actual application, of representative coverage, and of traceable provenance from different sources. Data acquisition must be transparently documented in full detail [cf. 68], including annotation guidelines, for reproducibility, replicability, and scalability. All ethical and legal questions related to the acquisition, storage and processing of health data must be taken into careful consideration. Bias must be controlled and documented clearly. The document shall specify quality and quantity criteria for the test data, including corresponding references. The annotation needs to be conducted by experts with a defined level of expertise, with potentially several independent annotations per sample (if applicable). Technical matters, e.g. data formats [cf. 69, 70] and data management [71], need to be specified. A reference model can potentially be defined (e.g. “average human performance for this task”, “best in class”). Limiting factors for data availability, such as finances or time, should be stated.

The plan detailed in the topic description document must be implemented in practice. The test data must be provided or acquired, and measures for quality assurance taken. The evaluation routine must be implemented, and the code published together with at least a few example data with
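The statistical test metrics named above (precision, specificity, F1 score and area under the ROC curve) can be illustrated for a binary classification task as follows. This is a minimal sketch in plain Python; the data and the 0.5 threshold are invented for illustration and are not part of any focus group deliverable.

```python
# Sketch: computing the performance metrics named in the text for a
# hypothetical binary health-AI task (labels and scores are invented).

def binary_metrics(y_true, y_score, threshold=0.5):
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)  # true-negative rate
    f1 = 2 * precision * recall / (precision + recall)
    # AUC as the probability that a positive sample outranks a negative one
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1 if p > n else 0.5 if p == n else 0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return {"precision": precision, "specificity": specificity,
            "f1": f1, "auc": auc}

# Hypothetical reference annotations and model scores
m = binary_metrics([1, 0, 1, 1, 0, 0, 1, 0],
                   [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1])
```

Which single or combined metric is decisive remains a per-topic choice, as the text notes [61].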
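Where several independent expert annotations per sample exist, a single reference label must somehow be consolidated. One common approach, shown here purely as a hedged sketch (the paper does not prescribe a consolidation rule, and the labels below are invented), is a majority vote that also records per-sample agreement:

```python
# Sketch: consolidating several independent expert annotations per
# sample into one reference label by majority vote, keeping the
# agreement ratio as a simple quality indicator. Illustrative only.
from collections import Counter

def consolidate(annotations):
    """annotations: list of per-sample label lists, one label per expert."""
    result = []
    for labels in annotations:
        label, votes = Counter(labels).most_common(1)[0]
        result.append({"label": label, "agreement": votes / len(labels)})
    return result

consensus = consolidate([
    ["malignant", "malignant", "benign"],  # 2 of 3 experts agree
    ["benign", "benign", "benign"],        # unanimous
])
```

A low agreement ratio can flag samples for re-annotation or exclusion, supporting the quality criteria the topic description document is required to specify.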
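The data checks and traceability requirements mentioned for the data and AI solution handling WG could, for instance, be supported by a checksum manifest over the test data set, so every transferred file can be verified against its recorded digest. This is one possible sketch, not the focus group's actual tooling; the file name is hypothetical.

```python
# Sketch: a SHA-256 manifest for test data files, supporting integrity
# checks after secure transfer and traceable provenance. Illustrative.
import hashlib
import json
import pathlib
import tempfile

def sha256_of(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest(data_dir):
    return {p.name: sha256_of(p)
            for p in sorted(pathlib.Path(data_dir).glob("*")) if p.is_file()}

def verify_manifest(data_dir, manifest):
    return all(sha256_of(pathlib.Path(data_dir) / name) == digest
               for name, digest in manifest.items())

# Demonstration on a temporary directory with one dummy file
tmp = tempfile.mkdtemp()
(pathlib.Path(tmp) / "sample_001.png").write_bytes(b"not a real image")
manifest = build_manifest(tmp)
ok = verify_manifest(tmp, manifest)
print(json.dumps(manifest, indent=2))
```

Publishing such a manifest alongside the evaluation code would let third parties confirm they received exactly the documented test set.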




