Page 44 - Kaleidoscope Academic Conference Proceedings 2020

encryption and decryption services and homomorphic computing methods. A
Rater quantifies the accuracy or veracity of the information stored by the
“Data Point”, and serves as a feedback loop that can confirm the validity of
information, identify errors, and deal systematically with ambiguities and
variability. An Editor is essential to maintain the quality and consistency
of the “Data Points” and to enforce rules or guidelines for style, support a
common dictionary of terms, and similar items. It provides feedback to the
originators of data, and is a primary mechanism for making sure that the
information is reusable and interoperable with that from other originators.
The Publisher is a system that then oversees the release of data content
within the enterprise and, where applicable, a larger ecosystem. The content
can then be discovered, where needed and authorised, and delivered through
either pull or push mechanisms.

The “Data Points” form the underlayer that allows the set of tools, already
matured and tested over the last decade, to extract value from information
and to orchestrate and execute the complex processes that better align
decisions and actions across an enterprise. The tools are part of the suite
produced by W3C and their derivatives (RDF, TripleStore, OWL, etc.). A key
function of the tools is the ability to form Knowledge Graphs (KGs), that is,
to explicitly reveal the linkages and dependencies among the “Data Points”.
This is where the touch point with AI and ML happens and makes a tremendous
difference. There are several aspects to this. The first is that data
distributed across an enterprise and its ecosystem can be collected in a
triple store and made available for analysis, with the proper authorization
and with the proper protection. Included here is the ability to perform
computations through secret-sharing or homomorphic computing methods without
revealing the underlying protected data. The second is that the linkages
found from the KGs in turn explicitly expose the common features found in the
data. This then automates both the labeling of data sets and the
incorporation of features in AI/ML algorithms.

While what has been described does not lead to “General AI”, the hypothesis
is that it nevertheless reduces the number of misfires in which ML algorithms
return surprisingly incorrect results.

I would like to conclude with an example of how this may work. A common
example cited by researchers is the behavior of ML algorithms used to
identify a “Stop” sign in a scene for controlling an autonomous vehicle. The
ML algorithm is fed labeled pictures of “Stop” signs under various conditions
and eventually does a very good job of correctly identifying the presence of
a stop sign. What researchers have noticed is that a small overlay or smear
on the stop sign can throw the algorithm off completely, so that it no longer
acknowledges the presence of a stop sign. They have also noticed that
introducing a more or less random pattern in a pixelated image can produce a
similar effect. As human observers, we have an element of common sense that
keeps us from being thrown off in the same way. When we see a stop sign and
the context that it is associated with, we see other features that prevent us
from being fooled. These include the fact that the stop sign is octagonal,
that it is red, that it has the letters STOP on it, that it is proximate to
an intersection or a crosswalk, and so on. These are all features that are
also revealed by a KG. An ML model that explicitly uses these features is
also much harder to fool. This is an example of why how we treat our data,
the tools that are available to analyze it, and how ML models are formulated
are much more powerful together. This is an area of research that deserves
considerably more attention.

An early version of the scheme described above was successfully implemented
at a number of large financial institutions. This included the first full
“Data Point” approach across an enterprise. It also included the creation of
enterprise KGs in two organizations. The work was led by Jacobus Geluk, who
is currently the CTO of Agnos.Ai. The “Data Point” implementation was built
out at BNY Mellon and has been in place since 2017. Jacobus also led the
architecture teams that created the KGs at BNY Mellon and at Bloomberg.

Figure 9 - Data Point Protocol and its ecosystem

4.3   Technologies in Focus – The Importance of Standards

For wide acceptance of AI methods and data approaches, standards are crucial
if they are to be the building blocks of




