intentionally used in order to produce discriminatory results. Techniques for removing bias based on a protected attribute focus on ensuring that an individual's predicted label is independent of their protected attributes. 131 However, even if protected attributes are not explicitly included, correlated attributes (proxies) may be included in the data set, resulting in outcomes that may be discriminatory. Addressing this in machine learning is challenging, but tests have been developed to assess the impact of an automated decision on different protected groups. 132
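One simple family of such tests compares outcome rates across groups. The sketch below is a hypothetical illustration, not a test prescribed by the sources cited here: it computes approval rates for two groups and flags a possible disparate impact where the protected group's rate falls below four fifths of the reference group's, a threshold used as a rule of thumb in US employment guidance. The data, group labels and function names are invented for the example.

```python
"""Minimal sketch of a disparate impact check on model decisions.

Assumes a list of (group, approved) records; the group labels, the data
and the 0.8 threshold (the US "four-fifths" rule of thumb) are illustrative.
"""
from collections import defaultdict


def selection_rates(decisions):
    """Return the approval rate observed for each group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, was_approved in decisions:
        total[group] += 1
        approved[group] += 1 if was_approved else 0
    return {g: approved[g] / total[g] for g in total}


def disparate_impact_ratio(decisions, protected_group, reference_group):
    """Ratio of the protected group's approval rate to the reference group's."""
    rates = selection_rates(decisions)
    return rates[protected_group] / rates[reference_group]


# Toy data: (group, loan approved?)
decisions = [("A", True), ("A", True), ("A", False), ("A", True),
             ("B", True), ("B", False), ("B", False), ("B", False)]

ratio = disparate_impact_ratio(decisions, protected_group="B", reference_group="A")
print(f"Disparate impact ratio: {ratio:.2f}")
if ratio < 0.8:  # the four-fifths convention used in US employment guidance
    print("Potential disparate impact: review the model and its inputs.")
```

A low ratio would not by itself establish unlawful discrimination, since justifications such as business necessity may apply, but it is the kind of statistical signal that could prompt further review of the model and its inputs.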
In some countries, where bias is unintentional, it may nevertheless be unlawful if it has "disparate impact," which arises where the outcomes from a selection process are widely different for a protected class of persons (e.g., by gender, race or ethnicity or religion) compared with other groups despite the process appearing to be neutral. The notion of disparate impact was developed from a US Supreme Court decision in 1971, 133 which found that certain intelligence test scores and high school diplomas were largely correlated with race, rendering the hiring decisions based on them discriminatory. The legal theory was recently reaffirmed 134 when in 2015 the US Supreme Court held that a plaintiff may establish a prima facie case of discrimination under the Fair Housing Act without evidence that it was intentional if they bring statistical proof that a governmental policy causes a disparate impact. 135

The involvement of computers makes it more difficult to determine disparate impact, and thus bias. Disclosing and explaining the process of selection by algorithm may be difficult or effectively impossible. Nevertheless, where it can be shown that a model produces discriminatory results, it may violate laws prohibiting discrimination, although proving this may be difficult, and justifications such as business necessity may also apply. 136

Discriminatory selection could occur without involving protected groups. For instance, where digital financial services algorithms infer from user data that an individual is experiencing financial liquidity problems, payday lenders may be able to target vulnerable individuals with advertisements and offers for loans at high interest rates and charges. Competition from firms like ZestFinance may actually drive down the cost of lending to such groups, but concerns may arise if discriminatory selection has adverse results for an individual. 137
Addressing discrimination tendencies

One approach to address machine learning's potential tendency towards discrimination is to incorporate randomness into the data. 138 For instance, a machine learning algorithm for extending credit may be trained using initial data that indicates that a certain group (e.g., from a particular postcode or of a particular gender or race) tends to include less reliable debtors. If the model were then to extend credit only to other groups, a self-fulfilling prophecy may result whereby the characteristics of successful debtors correlate with non-membership of the protected group. Incorporating an element of randomness into the model, so that some individuals who would not ordinarily be predicted to be reliable debtors nevertheless receive credit, could allow the model to test the validity of the initial assumptions. The introduction of data that evolves to be closer to the real world may lead to improvements in the overall fairness and accuracy of the system.
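By way of illustration only, the sketch below shows how such randomness might be introduced at the decision stage: with a small probability, an applicant the model would otherwise reject is approved anyway, so that their later repayment behaviour can be used to test the model's assumptions. The scores, threshold and exploration rate are assumptions invented for the example, not features of any deployed system.

```python
"""Minimal sketch of adding randomness ("exploration") to a credit decision.

The scores, threshold and exploration rate below are illustrative
assumptions: with a small probability, an applicant the model would reject
is approved anyway, so that later repayment outcomes can test the model.
"""
import random

EXPLORATION_RATE = 0.05   # fraction of would-be rejections approved at random
APPROVAL_THRESHOLD = 0.6  # model score needed for a normal approval


def decide(score: float, rng: random.Random) -> dict:
    """Return the credit decision and whether it was an exploratory approval."""
    if score >= APPROVAL_THRESHOLD:
        return {"approved": True, "exploratory": False}
    # Occasionally approve a low-scoring applicant to collect real outcome data
    if rng.random() < EXPLORATION_RATE:
        return {"approved": True, "exploratory": True}
    return {"approved": False, "exploratory": False}


rng = random.Random(42)  # seeded so the example is reproducible
applicant_scores = [0.72, 0.41, 0.55, 0.80, 0.30]
for score in applicant_scores:
    print(score, decide(score, rng))
```

In practice, the size of any such exploration rate would need to be balanced against the cost of lending to applicants the model would otherwise reject.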
Another suggested approach is to select or modify input data so that the output meets a fairness test operated by the system. Additional training samples from a minority group might be selected in order to avoid the model over-reflecting its minority status. There are other methods for ensuring statistical parity among groups that can be adopted, 139 and the important thing is to ensure that these are designed into the model, even using artificial intelligence to monitor artificial intelligence.
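The sketch below illustrates the simplest version of this resampling idea: records from an under-represented group are duplicated until the group reaches a chosen share of the training set. The record format, group labels and target share are invented for the example; a production system would likely use more careful reweighting or fairness-constrained training.

```python
"""Minimal sketch of rebalancing training data for an under-represented group.

The record structure, group labels and target share are illustrative:
minority-group records are duplicated (oversampled) until that group
makes up the chosen share of the training set.
"""
import random


def oversample(records, minority_group, target_share, rng):
    """Duplicate minority-group records until they form target_share of the data."""
    minority = [r for r in records if r["group"] == minority_group]
    majority = [r for r in records if r["group"] != minority_group]
    if not minority:
        return list(records)
    # Need m minority records with m / (m + len(majority)) >= target_share,
    # i.e. m >= target_share * len(majority) / (1 - target_share)
    needed = int(target_share * len(majority) / (1 - target_share)) + 1
    extra = [rng.choice(minority) for _ in range(max(0, needed - len(minority)))]
    return majority + minority + extra


rng = random.Random(0)
records = ([{"group": "majority", "label": 1}] * 90
           + [{"group": "minority", "label": 1}] * 10)
balanced = oversample(records, minority_group="minority", target_share=0.3, rng=rng)
share = sum(r["group"] == "minority" for r in balanced) / len(balanced)
print(f"Minority share after oversampling: {share:.2f}")
```

A statistical parity check of the kind sketched earlier could then be run on the retrained model's outputs to confirm that the rebalancing had the intended effect.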
In some cases, one might expect there to be a commercial incentive to remove bias. Bias is not only harmful to a service's reputation, but it may be suboptimal business economics for the service provider. If an applicant's postcode leads to a lower score and rejection of their loan application despite the applicant having a healthy income, low level of indebtedness and other positive attributes, then the lender has missed an opportunity to make a profitable loan.

In a perfect static market where providers compete on the same service and may refine it to increase market share, one might expect designers to improve algorithms over time to weed out bias. However, in a dynamic market where new models and services are constantly being developed, with new data constantly being added, bias may be addressed only for the model to be updated or replaced by a new one that may reflect new bias, renewing the problem. Businesses may also focus more on rapid growth to win the new market, while viewing discriminatory impact on protected groups as a lower-level priority. Even if the market might be expected over time to refine algorithms to reduce bias, in many cases it is simply