
intentionally used in order to produce discriminatory results.

Techniques for removing bias based on a protected attribute focus on ensuring that an individual’s predicted label is independent of their protected attributes.131 However, even if protected attributes are not explicitly included, correlated attributes (proxies) may be included in the data set, resulting in outcomes that may be discriminatory. Addressing this in machine learning is challenging, but tests have been developed to assess the impact of an automated decision on different protected groups.132
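Such tests often reduce to a comparison of favourable-outcome rates between groups. The sketch below, in Python, illustrates one common check of this kind, a disparate-impact ratio with a “four-fifths” threshold; the function names, the example data and the 0.8 rule of thumb are illustrative assumptions rather than anything prescribed in this report.

    from collections import defaultdict

    def selection_rates(decisions):
        """Favourable-outcome rate for each group.

        `decisions` is an iterable of (group, approved) pairs, where
        `approved` is True when the automated decision was favourable.
        """
        totals = defaultdict(int)
        approved = defaultdict(int)
        for group, ok in decisions:
            totals[group] += 1
            if ok:
                approved[group] += 1
        return {g: approved[g] / totals[g] for g in totals}

    def disparate_impact_ratio(decisions, protected, reference):
        """Ratio of the protected group's approval rate to the reference group's.

        A ratio below roughly 0.8 (the "four-fifths rule" used in US employment
        guidance) is a common red flag warranting closer review.
        """
        rates = selection_rates(decisions)
        return rates[protected] / rates[reference]

    # Illustrative data: (group, loan approved?)
    decisions = ([("A", True)] * 50 + [("A", False)] * 50
                 + [("B", True)] * 30 + [("B", False)] * 70)

    print(f"ratio = {disparate_impact_ratio(decisions, 'B', 'A'):.2f}")  # 0.60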
In some countries, where bias is unintentional, it may nevertheless be unlawful if it has “disparate impact,” which arises where the outcomes of a selection process are widely different for a protected class of persons (e.g., by gender, race, ethnicity or religion) compared with other groups, despite the process appearing to be neutral. The notion of disparate impact was developed from a 1971 US Supreme Court decision,133 which found that hiring decisions based on certain intelligence test scores and high school diplomas were discriminatory because those criteria were largely correlated with race.134 The legal theory was recently reaffirmed when, in 2015, the US Supreme Court held that a plaintiff may establish a prima facie case of discrimination under the Fair Housing Act without evidence that it was intentional, if they bring statistical proof that a governmental policy causes a disparate impact.135

The involvement of computers makes it more difficult to determine disparate impact, and thus bias. Disclosing and explaining the process of selection by algorithm may be difficult or effectively impossible. Nevertheless, where it can be shown that a model produces discriminatory results, it may violate laws prohibiting discrimination, although proving this may be difficult, and justifications such as business necessity may also apply.136

Discriminatory selection could occur without involving protected groups. For instance, where digital financial services algorithms infer from user data that an individual is experiencing financial liquidity problems, payday lenders may be able to target vulnerable individuals with advertisements and offers for loans at high interest rates and charges. Competition from firms like ZestFinance may actually drive down the cost of lending to such groups, but concerns may arise if discriminatory selection has adverse results for an individual.137

Addressing discrimination tendencies

One approach to address machine learning’s potential tendency towards discrimination is to incorporate randomness into the data.138 For instance, a machine learning algorithm for extending credit may be trained using initial data indicating that a certain group (e.g., from a particular postcode or of a particular gender or race) tends to include less reliable debtors. If the model were then to extend credit only to other groups, a self-fulfilling prophecy may result whereby the characteristics of successful debtors correlate with non-membership of the protected group. Incorporating an element of randomness into the model, so that some individuals who would not ordinarily be predicted to be reliable debtors nevertheless receive credit, could allow the model to test the validity of its initial assumptions. The introduction of data that evolves to be closer to the real world may lead to improvements in the overall fairness and accuracy of the system.
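A minimal sketch of how such randomness might be injected is an epsilon-greedy style decision rule, in which a small share of applicants falling below the model’s approval threshold are approved anyway so that their repayment behaviour can feed back into later training data. The threshold, exploration rate and function names below are illustrative assumptions, not a method described in this report.

    import random

    def credit_decision(predicted_score, threshold=0.7, exploration_rate=0.05,
                        rng=random.random):
        """Approve when the model's score clears the threshold; otherwise
        approve a small random fraction of applicants anyway.

        Exploratory approvals generate repayment outcomes for people the model
        would otherwise never observe, so later retraining can test whether
        the original assumptions about them were valid.
        """
        if predicted_score >= threshold:
            return True, "model"        # ordinary model-driven approval
        if rng() < exploration_rate:
            return True, "exploration"  # random approval to gather new data
        return False, "model"

    # Illustrative use: tag exploratory approvals so their outcomes can be
    # weighted appropriately when the model is retrained.
    for score in (0.82, 0.41, 0.65, 0.90, 0.30):
        approved, reason = credit_decision(score)
        print(score, approved, reason)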
Another suggested approach is to select or modify input data so that the output meets a fairness test operated by the system. Additional training samples from a minority group might be selected in order to avoid the model over-reflecting its minority status. There are other methods for ensuring statistical parity among groups that can be adopted,139 and the important thing is to ensure that these are designed into the model, even using artificial intelligence to monitor artificial intelligence.
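One crude way to implement that kind of sample selection is to oversample minority-group records until they form a larger share of the training set. The sketch below assumes a simple list-of-dicts data layout and a 50 per cent target share, both purely illustrative; production pipelines would typically use stratified sampling, reweighting or synthetic data instead.

    import random

    def oversample_minority(samples, group_key, target_share=0.5, seed=0):
        """Duplicate minority-group records until they make up roughly
        `target_share` of the training set.

        `samples` is a list of dicts and `group_key` names the attribute
        whose least-frequent value should be boosted.
        """
        rng = random.Random(seed)
        by_group = {}
        for s in samples:
            by_group.setdefault(s[group_key], []).append(s)
        minority = min(by_group, key=lambda g: len(by_group[g]))
        others = [s for g, rows in by_group.items() if g != minority for s in rows]
        # Minority rows needed so that minority / (minority + others) == target_share.
        needed = int(target_share * len(others) / (1 - target_share))
        rebalanced = others + [rng.choice(by_group[minority]) for _ in range(needed)]
        rng.shuffle(rebalanced)
        return rebalanced

    # Illustrative use: 90 majority rows and 10 minority rows become roughly 50/50.
    data = ([{"group": "majority", "repaid": True}] * 90
            + [{"group": "minority", "repaid": True}] * 10)
    balanced = oversample_minority(data, "group")
    print(sum(1 for s in balanced if s["group"] == "minority"), "of", len(balanced))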
            petition  from  firms  like  ZestFinance  may  actually   model to be updated or replaced by a new one that
            drive down the cost of lending to such groups, but   may reflect new bias, renewing the problem. Busi-
            concerns may arise if discriminatory selection has   nesses may also focus more on rapid growth to win
            adverse results for an individual. 137             the new market, while viewing discriminatory impact
                                                               on protected groups as a lower level priority. Even
                                                               if the market might be expected over time to refine
                                                               algorithms to reduce bias, in many cases it is simply


