Page 8 - FIGI - Big data, machine learning, consumer protection and privacy
Executive Summary
This paper explores various challenges that consumer protection and data privacy law and regulation face with regard to big data and machine learning techniques, particularly where these are used for making decisions about services provided to consumers.

The beneficial opportunity data presents for development is widely recognized, particularly for the provision of digital financial services. Service providers can use big data to build a detailed personal profile of an individual, including his or her behaviour (e.g., preferences, activities and movements), which may be used for commercial offers. Big data and machine learning are being increasingly deployed for financial inclusion, not only in wealthy nations but also in developing countries. These new technologies also bring risks, some say tendencies, of bias in decision-making, discrimination and invasion of privacy.

Artificial intelligence involves techniques that seek to approximate aspects of human or animal cognition using computing machines. Machine learning refers to the ability of a system to improve its performance by recognising patterns in large datasets. Big data relies upon, and is typically defined by, computer processing involving high volumes and varieties of types of linked-up data processed at high velocity (the “three Vs”, sometimes expanded to four Vs by the addition of “veracity”).

Consumer protection involves the intervention of the State through laws and processes in what would otherwise be a private relationship between consumer and provider. It aims to compensate for perceived information, bargaining and resource asymmetries between providers and consumers.

Increasingly, countries are legislating to protect the personal data and privacy of their subjects, granting them rights that give them more power over how their personal data is used. These laws are under strain in an era of big data and machine learning. Complying with requirements to notify the consumer as to the purpose of data collection is difficult where, as in machine learning, the purpose may not be known at the time of notification. Consent is difficult to obtain when the complexity of big data and machine learning systems is beyond the consumer’s comprehension. The notion of data minimization (collecting and storing only data necessary for the purpose for which it was collected, and storing it for the minimum period of time) runs counter to the modus operandi of the industry, which emphasizes maximizing the volumes of data collection over time. As stated in a 2014 report to the US President, “The notice and consent is defeated by exactly the positive benefits that big data enables: new, non-obvious, unexpectedly powerful uses of data.”

Some suggest privacy expectations are highly contextual. Tighter restrictions on collection, use and sharing of personal data in some situations (and tiered consent which differentiates between types of data according to use or the organization that may use it) have been discussed. Sunset clauses providing that the individual’s consent to use his or her personal data will expire after a period of time (and potentially be renewed) have also been suggested. Efforts are also being made to develop technologies and services to manage consent better. There appears to be a genuine commercial opportunity for investment and innovation to improve management of such consumer consent.

The successful functioning of machine learning models and the accuracy of their outputs depend on the quality of the input data. Data protection and privacy laws increasingly impose legal responsibility on firms to ensure the accuracy of the data they hold and process. However, they do not legislate for accuracy of output from big data and machine learning systems. This raises questions about the regulatory responsibilities of those handling big data, concerning both the accuracy of input data in automated decisions and the data reported in formal credit data reporting systems. In some jurisdictions, this has given rise, among other remedies, to certain rights to object to automated decisions.

Inferences from input data generated by machine learning models determine how individuals are viewed and evaluated for automated decisions. Data protection and privacy laws may be insufficient to deal with the outputs of machine learning models that process such data. One concern of these laws is to prevent discrimination, typically by protecting special categories of groups (e.g., race, ethnicity, religion, gender). In the era of big data, however, non-sensitive data can be used to infer sensitive data.

Machine learning may lead to discriminatory results where the algorithms’ training relies on historical examples that reflect past discrimination, or the model fails to consider a wide enough set of factors. Addressing bias is challenging, but tests have been developed to assess where it may arise. In some