the user’s device before Apple anonymises the user data (dropping IP addresses and other metadata) and collects, aggregates and analyzes it. “Both the ingestion and aggregation stages are performed in a restricted access environment so even the privatized data isn’t broadly accessible to Apple employees.”155
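Apple’s production mechanisms are more elaborate than can be shown here, but the core idea of adding noise on the device can be sketched with classic randomized response. The following is an illustrative sketch only, not Apple’s implementation: each device flips its true bit with a known probability before reporting, and the aggregator debiases the noisy counts.

```python
import math
import random

def randomized_response(true_bit: bool, epsilon: float) -> bool:
    """Perturb on-device: report the true bit with probability
    e^eps / (e^eps + 1), otherwise report its negation."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return true_bit if random.random() < p_truth else not true_bit

def estimate_rate(noisy_reports: list, epsilon: float) -> float:
    """Aggregator-side debiasing: invert the known flip probability
    to recover an unbiased estimate of the population rate."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    observed = sum(noisy_reports) / len(noisy_reports)
    return (observed - (1 - p)) / (2 * p - 1)

# Each device adds noise locally, so the server never sees a raw value,
# yet the aggregate statistic remains estimable.
true_bits = [random.random() < 0.3 for _ in range(100_000)]
reports = [randomized_response(b, epsilon=1.0) for b in true_bits]
print(f"Estimated rate: {estimate_rate(reports, epsilon=1.0):.3f}")  # close to 0.300
```

Because the noise is added before the data leaves the device, individuals remain protected even if transport metadata were retained; dropping IP addresses and other metadata is a complementary safeguard.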
In addition to these sorts of measures, a policy of “separation of duties” can reduce privacy risks in processing personal data. This limits any single administrator’s power to a given role, with other roles managed by other administrators similarly limited, thus reducing the risk of a rogue administrator. Linked to this, a policy of “least privilege” would aim to ensure that each administrator will only have the powers necessary for their delegated function.
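As a sketch of how these two policies translate into an access-control configuration (the role names and actions below are hypothetical), each administrator maps to a single role, and no role combines permissions across processing stages:

```python
# Hypothetical roles for a data-processing pipeline. Separation of duties:
# no single role spans ingestion, aggregation and export. Least privilege:
# each role holds only the actions its function requires.
ROLE_PERMISSIONS = {
    "ingestion_admin":   {"ingest_raw"},
    "aggregation_admin": {"read_privatized", "aggregate"},
    "export_admin":      {"export_aggregates"},
}

def authorize(role: str, action: str) -> bool:
    """Deny by default; grant only actions explicitly assigned to the role."""
    return action in ROLE_PERMISSIONS.get(role, set())

assert authorize("aggregation_admin", "aggregate")
# A rogue ingestion administrator cannot also export results:
assert not authorize("ingestion_admin", "export_aggregates")
```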
Ultimately, the difficulty of preventing re-identification may mean that a black-and-white view on de-identification may not be helpful, and the debate over the efficacy of these techniques may need to be looked at “in a more nuanced way, accepting that in some, but not all cases, de-identification might provide acceptable answers.”156 Indeed, Cynthia Dwork suggests that continuous use of accurate data will eventually undermine privacy and that these techniques mitigate rather than eliminate risk:157
    [D]ata utility will eventually be consumed: the Fundamental Law of Information Recovery states that overly accurate answers to too many questions will destroy privacy in a spectacular way. The goal of algorithmic research on differential privacy is to postpone this inevitability as long as possible.
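Dwork’s point can be made concrete with the standard differential-privacy notion of a finite privacy budget. The sketch below (illustrative only) applies simple sequential composition: each answered query consumes part of a total epsilon, and once the budget is exhausted no further accurate answers can be released.

```python
class PrivacyBudget:
    """Track cumulative privacy loss (epsilon) under sequential composition."""

    def __init__(self, total_epsilon: float):
        self.remaining = total_epsilon

    def charge(self, query_epsilon: float) -> bool:
        """Permit the query only if the budget can absorb its cost."""
        if query_epsilon > self.remaining:
            return False  # answering more would "destroy privacy"
        self.remaining -= query_epsilon
        return True

budget = PrivacyBudget(total_epsilon=3.0)
answered = 0
while budget.charge(query_epsilon=0.5):  # each accurate answer has a cost
    answered += 1
print(answered)  # 6 -- after that, the dataset's utility is consumed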
In this light, regulation could seek to rely less on notification to consumers that their data will be collected, analyzed and shared, and on obtaining their consent to this, and more on ensuring that privacy enhancing technologies are continuously integrated into big data and machine learning data processing and updated to deal with evolving challenges. Achieving this may require establishing incentives in legislation that create liability for data breaches, essentially placing less of the economic burden on the consumer (through obtaining their consent) and more on the organizations collecting, using and sharing the data.
4.4 Protecting consumers in relation to the circulation of personal data about them
Big data and machine learning are made possible not only by supply of data from online activity and demand from service providers that rely on it, but by intermediaries – the third-party data brokers who trade in personal data. This results in a huge number of sources of data, as well as methods of collection and data formats.

Various risks to the consumer arise with transfer of personal data. Transfer of data from one entity to another increases the risk of breach due to the higher number of parties holding it, as well as from vulnerabilities of the transfer process itself. Sensitive, confidential data may be obtained by third parties without permission, risking identity theft, intrusive marketing and other privacy violations.

The very transfer of data to a third party may itself be something that the consumer might not have expected when originally sharing their data with a company, for example when accessing its service or when merely browsing the internet. Lastly, the proliferation of data about a person may increase the asymmetry of bargaining power between consumers and the firms selling them products and services, as discussed in section 4.4.

The transfer of data from one entity to another means that an organization processing the data will often have no direct relationship with the original entity that collected it; indeed, it may be at several levels of remove. The acquiring entity may lack information about whether the data was collected and is transferred in compliance with data protection and privacy laws.

Where data is obtained with user consent (e.g., credit card use data, financial transaction data, email data), the key question will be whether consent was validly obtained. For data obtained from public spaces (e.g., satellite insights data, drone data, surveillance footage, dropcam data), the key question will be whether the data was really obtained from public spaces, and in a manner consistent with surveillance laws. Where data was obtained from the internet without express user consent (web scraping, documented and undocumented APIs), the issue will be whether the data was obtained through authorized access. Certification approaches may emerge whereby data may be guaranteed to have been subject to de-identification, pseudonymization and anonymization before it is traded.
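A common building block for such guarantees is keyed pseudonymization, sketched below. The key handling here is illustrative only (in practice the key would be held in a hardware security module and rotated), and pseudonymized data may still count as personal data under laws such as the GDPR. Direct identifiers are replaced with stable, non-reversible tokens before the dataset changes hands:

```python
import hashlib
import hmac

# Illustrative placeholder: a real deployment would keep this key in an
# HSM and rotate it according to policy.
SECRET_KEY = b"example-pseudonymization-key"

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a keyed, non-reversible token.
    The same input always yields the same token, so records can still be
    joined, but the identifier cannot be recovered without the key."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

record = {"email": "alice@example.com", "txn_amount": 250}
record["email"] = pseudonymize(record["email"])
print(record)  # the email is now an opaque, consistent token
```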
Currently the market in data is very fluid. Firms buy and sell data, and reduce their risk of liability and thus the economic burden associated with data privacy, by obtaining contractual representations and warranties about compliance with privacy laws, such as whether any necessary user consent was obtained. Companies such as ZwillGen158 will advise firms relying on big data how to manage their economic risks arising from privacy law liability.