Page 41 - Kaleidoscope Academic Conference Proceedings 2020

      •     The second is the sheer improvement in computing power available for the ML training process (which I won't discuss here, but for which there is plenty of literature, an example being [13]). This step is extremely compute-intensive, and computing in the cloud has greatly democratized access to the required resources. The "weights" determined by the training process are subsequently used by an inference engine to relate the pattern in an input to a specific identity, result, or action.

      •     The third is the inference engines (IEs) that take advantage of the asymmetry between the training of an ML model and its execution. Such an inference engine can run 4-6 orders of magnitude faster than the time required for training. The IE can be purpose-built hardware or software running on a general-purpose computer. An IE can run many different models and be used for many applications. This is the device that uses the "weights".

      What goes along with the algorithmic aspect of ML are the necessary resources for computing, data storage, the processes used, operational aspects, and the time scales associated with these resources. These are illustrated in Figure 3.

      Figure 3 - Flows and processes for AI/ML applications

      The allure of ML is that it does not require the hard work demanded by traditional methods, or the specialized knowledge that went into developing models or simulations based on the basic laws of physics, chemistry, or mechanics. As shown in Figure 4, what ML algorithms learn is whether an input pattern is correctly identified with an output. The training consists of input patterns used to set weights within the network so that the outputs are clustered under a correct label. The best success has been with images. An example would be images of highway signs: the input could be pictures of "Stop" signs, "Yield" signs, or "Pedestrian Crossing" signs taken under as many conditions as possible. Success would be correctly assigning the right label to each sign. There are, however, occasional surprises, and inexplicably an ML model will produce an unexpected and incorrect result, even though this happens rarely. The root causes of such errors are in general not well understood. This poses a natural question about setting metrics and somehow determining error bounds to anticipate what level of incorrect results is likely and under what circumstances. It also begs the question of what is acceptable and what is not. If we were to perform a similar task as human beings as the ML model is being asked to do, we might not do any better, so where do we set the bar?

      Figure 4 - A ML learning network with hidden layers

      The last point I would like to bring up is that the main effort in using ML is not what goes into the ML model algorithms or into the computing. The overwhelming effort goes into the data and its life cycle. Typically, the fraction is 50% - 80%, as shown in Figure 5. The approach in dealing with training is to keep the model as simple as possible. As an example, if we want the model to recognize a good weld pool from images in a welding process, the training data may have a simple annotation without an explicit description of the weld features; we leave that to the model to figure out. While I suspect that adding such features as part of the training would improve the results, such annotation is expensive to do and just raises the data costs even further. If it were possible to do such annotation automatically it might well be worth it; this is an area where there is a lot of experimentation yet to be done.

      Figure 5 - Typical allocation of effort, from "Towards Data Science" - https://towardsdatascience.com/
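      The split between a compute-intensive training phase that sets the weights and a cheap forward pass that reuses them can be sketched in a few lines of Python/NumPy. This is purely illustrative and not from the paper: the toy XOR patterns, layer sizes, learning rate, and iteration count are my own assumptions, chosen only to show training clustering inputs under labels and inference as a single forward pass over frozen weights.

      ```python
      import numpy as np

      # Toy setup (illustrative assumption): four input patterns and their labels.
      X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # input patterns
      y = np.array([[0], [1], [1], [0]], dtype=float)              # correct labels

      rng = np.random.default_rng(0)
      W1 = rng.normal(size=(2, 8))   # input -> hidden-layer weights
      b1 = np.zeros(8)
      W2 = rng.normal(size=(8, 1))   # hidden -> output weights
      b2 = np.zeros(1)

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      def forward(x, W1, b1, W2, b2):
          """Inference engine: one cheap forward pass using the trained weights."""
          h = np.tanh(x @ W1 + b1)
          return sigmoid(h @ W2 + b2), h

      out0, _ = forward(X, W1, b1, W2, b2)
      loss_before = np.mean((out0 - y) ** 2)

      # Training: the compute-intensive loop that determines the "weights"
      # (plain gradient descent on squared error, repeated many times).
      lr = 0.5
      for _ in range(10000):
          out, h = forward(X, W1, b1, W2, b2)
          d_out = (out - y) * out * (1 - out)     # gradient at the output
          dW2 = h.T @ d_out
          db2 = d_out.sum(axis=0)
          d_h = (d_out @ W2.T) * (1 - h ** 2)     # backpropagated to hidden layer
          dW1 = X.T @ d_h
          db1 = d_h.sum(axis=0)
          W2 -= lr * dW2; b2 -= lr * db2
          W1 -= lr * dW1; b1 -= lr * db1

      # Inference: the frozen weights now map each pattern toward its label.
      pred, _ = forward(X, W1, b1, W2, b2)
      loss_after = np.mean((pred - y) ** 2)
      labels = (pred > 0.5).astype(int).ravel()
      print(loss_before, loss_after, labels)
      ```

      The asymmetry the text describes is visible in the structure: training runs the forward pass thousands of times plus a backward pass each iteration, while deployment needs only the single `forward` call, which is why an inference engine can be orders of magnitude faster than training.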




