Page 37 - FIGI - Big data, machine learning, consumer protection and privacy

However, two problems arise in providing an explanation to the consumer in the context of big data and machine learning:

First, the techniques are hard to explain, particularly in plain language to consumers. Machine learning models are described as “opaque”176 and as “black boxes.”177 Even providing source code will not inform even the computer scientists how a decision was made, as “[m]achine learning is the science of getting computers to act without being explicitly programmed.”178

Second, to some degree, the machine learning models are the subject of trade secrets and software copyright that are the result of investment and exist in a competitive commercial market. A machine learning operator may be reluctant to share the coding of or an explanation for the machine learning algorithm lest this weaken competitive opportunity and undermine the initial investment.

These factors present important challenges for accountability to consumers for the use of algorithms.179 In particular, the difficulty of explaining to a consumer the relationship between data inputs and outputs is a barrier to the consumer challenging decisions made about them. Nevertheless, even if explanations are currently difficult to generate, it may be that only if such legal rights are created will the necessary efforts be made.

There may be important reasons to make such efforts. Society-wide acceptance of big data and machine learning, particularly automated decision-making and the services that rely on it, will depend at least in part on trust – trust that the relevant information has been considered in a reasonable manner. It is a common perception that in machine learning, correlation and prediction are the governing principles, and that causality and reasoning are unimportant. In 2008, Chris Anderson declared the scientific method obsolete, overtaken by the corroborative power of mass correlations.180 Machine learning identifies correlations between factors, which do not amount to causation. It may be able to make predictions for future behaviour, but not explain the reasons.

Machine learning occurs where a computer system is exposed to large quantities of data (from historical examples), is trained to observe patterns in the data, and infers a rule from those patterns. Rather than establishing rules directly, humans generate a computerised rule-making process. This abstraction, or disconnect, between the humans and the decision, creates challenges for verifying the rules that are created. This makes it difficult to hold them accountable when the rules or their results fail to meet policy goals, or even fall foul of laws, particularly relating to discrimination. Indeed, not only do ordinary people not understand machine learning models, but even those who develop them are often unable to explain why they succeed.

However, in many sectors, it is not workable for machine learning models to be understood only by data scientists and computer programmers. In medicine, banking, insurance and other sectors, researchers and even practitioners must understand the machine learning models they rely on if they are to trust them and their results. Trade-offs may arise between keeping models and modelling processes transparent and interpretable (which requires minimising complexity) and developing machine learning models that evolve over time to improve their accuracy and performance (which makes them more complex and harder to explain).

Furthermore, the accuracy of machine learning depends on how data used for training and validation of machine learning models is selected and curated. It also depends on articulating properly the task of the model, allowing for well-developed hypotheses, and selecting relevant metrics for performance. Ultimately, given enough time and resources, a computer programme should be explainable, or otherwise there can be no reason to have confidence in the accuracy of its conclusions.181

While some suggest that complexity defies explanation, others argue that such a view conceals the ready understandability of algorithms, and that “rather than discounting systems which cause bad outcomes as fundamentally inscrutable and therefore uncontrollable, we should simply label the application of inadequate technology what it is: malpractice, committed by a system’s controller.”182 Still, there are clearly challenges to providing explanations for automated decisions that can be readily understood by inexpert humans.

Regulating for adequate explanations

When a financial service provider makes a decision based on data inputs (e.g., income and asset levels, post code), the decision is ultimately based on inferences made from these sources, such as whether the individual’s risk of default on a loan of a certain size over a certain period is too high to justify the loan. Typically, data protection laws do not provide protection against unreasonable inferences, leaving such matters to sector-specific laws, if at all. Indeed, most data protection laws do not require the data controller to provide an explanation for an automat-


