Page 44 - Kaleidoscope Academic Conference Proceedings 2020
encryption and decryption services and homomorphic computing methods. A
Rater is important to quantify the accuracy or veracity of the information
stored by the “Data Point”, and serves as a feedback loop that can confirm
the validity of information, identify errors, and systematically deal with
ambiguities and variability. An Editor is essential to maintain the quality
and consistency of the “Data Points” and to enforce rules or guidelines for
style, support a common dictionary of terms, and similar items. It provides
feedback to the originators of data, and is a primary mechanism for making
sure that the information is reusable and interoperable with that from
other originators. The Publisher is a system that then oversees the release
of data content within the enterprise and, where applicable, a larger
ecosystem. The content can then be discovered, where needed and authorised,
and delivered through either pull or push mechanisms.

The “Data Points” form the underlayer that allows the set of tools, already
matured and tested over the last decade, to extract value from information
and to orchestrate and execute the complex processes that better align
decisions and actions across an enterprise. The tools are part of the suite
produced by W3C and their derivatives (RDF, TripleStore, OWL, etc.). A key
function of the tools is the ability to form Knowledge Graphs (KGs), that
is, to explicitly reveal the linkages and dependencies among the “Data
Points”. This is where the touch point with AI and ML happens and makes a
tremendous difference. There are several aspects to this. The first is that
data distributed across an enterprise and its ecosystem can be collected in
a triple store and made available for analysis, with the proper
authorization and with the proper protection. Included here is the ability
to perform computations through shared secret or homomorphic computing
methods without revealing the underlying protected data. The second is that
the linkages found from the KGs in turn explicitly expose the common
features found in the data. This then automates both the labeling of data
sets and the incorporation of features in AI/ML algorithms. While what has
been described does not lead to “General AI”, the hypothesis is that it
nevertheless reduces the number of misfires that sometimes occur with ML
algorithms returning surprisingly incorrect results.

I would like to conclude with an example of how this may work. A common
example cited by researchers is the behavior of ML algorithms used to
identify a “Stop” sign in a scene for controlling an autonomous vehicle.
The ML algorithm is fed labeled pictures of “Stop” signs under various
conditions and eventually does a very good job of correctly identifying the
presence of a stop sign. What researchers have noticed is that a small
overlay or smear on the stop sign can throw the algorithm off completely,
so it no longer acknowledges the presence of a stop sign. They have also
noticed that introducing a more or less random pattern in a pixelated image
can produce a similar effect. As human observers, we have an element of
common sense that keeps us from being thrown off in the same way. When we
see a stop sign and the context that it is associated with, we see other
features that prevent us from being fooled. These include the fact that the
stop sign is octagonal, that it is red, that it has the letters STOP on it,
that it is proximate to an intersection or a crosswalk, and so on. These
are all features that are also revealed by a KG. An ML model that
explicitly uses these features is also much harder to fool. This is an
example of why the combination of how we treat our data, the tools that are
available to analyze it, and how ML models are formulated is much more
powerful than any of these alone. This is an area of research that deserves
considerably more attention.

An early version of the scheme described above was successfully implemented
at a number of large financial institutions. This included the first full
“Data Point” approach across an enterprise. It also included the creation
of enterprise Knowledge Graphs (KGs) in two organizations. The work was led
by Jacobus Geluk, who is currently the CTO of Agnos.Ai. The “Data Point”
implementation was built out at BNY Mellon and has been in place since
2017. Jacobus also led the architecture teams that created the KGs at BNY
Mellon and at Bloomberg.

Figure 9 - Data Point Protocol and its ecosystem

4.3 Technologies in Focus – The Importance of Standards

For wide acceptance of AI methods and data approaches, standards are
crucial if they are to be the building blocks of
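
The idea that linkages in a KG expose common features, and that these can
drive labeling, can be illustrated with a minimal sketch. It is not the
implementation referred to in this paper: a plain Python set of
(subject, predicate, object) tuples stands in for an RDF triple store, and
every identifier (the "dp:" and "ex:" names, the helper functions, the
choice of "rdf:type" as the label) is a hypothetical chosen for
illustration only.

```python
# Minimal sketch: an in-memory stand-in for a triple store.
# All data points, predicates and helper names are illustrative assumptions.

TRIPLES = {
    ("dp:1", "rdf:type", "ex:Bond"),
    ("dp:1", "ex:currency", "USD"),
    ("dp:1", "ex:issuer", "ex:AcmeCorp"),
    ("dp:2", "rdf:type", "ex:Bond"),
    ("dp:2", "ex:currency", "USD"),
    ("dp:2", "ex:issuer", "ex:Globex"),
    ("dp:3", "rdf:type", "ex:Loan"),
    ("dp:3", "ex:currency", "EUR"),
    ("dp:3", "ex:issuer", "ex:AcmeCorp"),
}


def find(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def features(triples, subject):
    """Collect the (predicate, object) pairs of one data point."""
    return {(p, o) for _, p, o in find(triples, s=subject)}


def shared_features(triples, a, b):
    """Features that two data points have in common (their explicit linkage)."""
    return features(triples, a) & features(triples, b)


def labeled_dataset(triples, label_predicate="rdf:type"):
    """Turn the graph into (features, label) rows, one per data point.

    The object of label_predicate becomes the label; every other
    predicate/object pair becomes a named feature.
    """
    rows = {}
    for s in sorted({t[0] for t in triples}):
        feats = {p: o for _, p, o in find(triples, s=s) if p != label_predicate}
        labels = [o for _, _, o in find(triples, s=s, p=label_predicate)]
        rows[s] = (feats, labels[0] if labels else None)
    return rows


if __name__ == "__main__":
    print(shared_features(TRIPLES, "dp:1", "dp:2"))
    for subject, (feats, label) in labeled_dataset(TRIPLES).items():
        print(subject, label, feats)
```

In this sketch the labels and features come directly from the graph rather
than from manual annotation, which is the sense in which a KG could
automate labeling. A production system would presumably use a real triple
store and a query language such as SPARQL instead of these helpers.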