Page 34 - FIGI - Big data, machine learning, consumer protection and privacy

the user’s device before Apple anonymises the user data (dropping IP addresses and other metadata) and collects, aggregates and analyzes it. “Both the ingestion and aggregation stages are performed in a restricted access environment so even the privatized data isn’t broadly accessible to Apple employees.”155

In addition to these sorts of measures, a policy of “separation of duties” can reduce privacy risks in processing personal data. This limits any single administrator’s power to a given role, with other roles managed by other administrators who are similarly limited, thus reducing the risk of a rogue administrator. Linked to this, a policy of “least privilege” would aim to ensure that each administrator has only the powers necessary for their delegated function.

Ultimately, the difficulty of preventing re-identification may mean that a black-and-white view of de-identification is not helpful, and the debate over the efficacy of these techniques may need to be looked at “in a more nuanced way, accepting that in some, but not all cases, de-identification might provide acceptable answers.”156 Indeed, Cynthia Dwork suggests that continuous use of accurate data will eventually undermine privacy, and that the techniques mitigate rather than eliminate risk:157

   [D]ata utility will eventually be consumed: the Fundamental Law of Information Recovery states that overly accurate answers to too many questions will destroy privacy in a spectacular way. The goal of algorithmic research on differential privacy is to postpone this inevitability as long as possible.

In this light, regulation could seek to rely less on notifying consumers that their data will be collected, analyzed and shared, and on obtaining their consent to this, and more on ensuring that privacy enhancing technologies are continuously integrated into big data and machine learning data processing and updated to deal with evolving challenges. Achieving this may require establishing incentives in legislation that create liability for data breaches, essentially placing less of the economic burden on the consumer (via their consent) and more on the organizations collecting, using and sharing the data.

4.4 Protecting consumers in relation to the circulation of personal data about them

Big data and machine learning are made possible not only by the supply of data from online activity and demand from service providers that rely on it, but by intermediaries – the third-party data brokers who trade in personal data. This results in a huge number of sources of data, as well as methods of collection and data formats.

Various risks to the consumer arise with the transfer of personal data. Transfer of data from one entity to another increases the risk of breach, both because more parties hold the data and because of vulnerabilities in the transfer process itself. Sensitive, confidential data may be obtained by third parties without permission, risking identity theft, intrusive marketing and other privacy violations.

The very transfer of data to a third party may itself be something that the consumer might not have expected when originally sharing their data with a company, for example when accessing its service or when merely browsing the internet. Lastly, the proliferation of data about a person may increase the asymmetry of bargaining power between consumers and the firms selling them products and services, as discussed in section 4.4.

The transfer of data from one entity to another means that an organization processing the data will often have no direct relationship with the original entity that collected it; indeed, it may be at several levels of remove. The acquiring entity may lack information about whether the data was collected, and is transferred, in compliance with data protection and privacy laws.

Where data is obtained with user consent (e.g., credit card use data, financial transaction data, email data), the key question will be whether consent was validly obtained. For data obtained from public spaces (e.g., satellite insights data, drone data, surveillance footage, dropcam data), the key question will be whether the data was really obtained from public spaces, and in a manner consistent with surveillance laws. Where data was obtained from the internet without express user consent (web scraping, documented and undocumented APIs), the issue will be whether the data was obtained through authorized access. Certification approaches may emerge whereby data is guaranteed to have been subject to de-identification, pseudonymization and anonymization before it is traded.

Currently the market in data is very fluid. Firms buy and sell data, and reduce their risk of liability, and thus the economic burden associated with data privacy, by obtaining contractual representations and warranties about compliance with privacy laws, such as whether any necessary user consent was obtained. Companies such as ZwillGen158 will advise firms relying on big data on how to manage their economic risks arising from privacy law liability.
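The “separation of duties” and “least privilege” policies discussed above can be made concrete with a minimal sketch of role-based access control. This is an illustrative example, not from the report: the role and permission names are invented, and a real deployment would sit inside an identity and access management system.

```python
# Illustrative sketch of "least privilege" and "separation of duties".
# Each administrator role is granted only the permissions its function
# requires (least privilege), and no single role holds both halves of a
# sensitive operation, such as requesting and approving a data export
# (separation of duties). Role and permission names are hypothetical.

ROLE_PERMISSIONS = {
    "ingest_admin":    {"raw.write"},        # may load raw data only
    "analyst":         {"aggregate.read"},   # may read aggregates only
    "export_admin":    {"export.request"},   # may request an export...
    "export_approver": {"export.approve"},   # ...but approval is separate
}

def is_allowed(role: str, permission: str) -> bool:
    """Deny by default: a role may do only what it was explicitly granted."""
    return permission in ROLE_PERMISSIONS.get(role, set())

# No single role can both request and approve an export, so a rogue
# administrator cannot exfiltrate data alone:
assert is_allowed("export_admin", "export.request")
assert not is_allowed("export_admin", "export.approve")
assert not is_allowed("analyst", "raw.write")
```

The deny-by-default lookup is the design point: privileges exist only where they were explicitly granted, which is exactly what a least-privilege policy asks for.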

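The differential privacy research Dwork’s quote refers to rests on mechanisms like the Laplace mechanism, which the following sketch illustrates (this example is not from the report). Noise calibrated to a query’s sensitivity and a privacy parameter ε masks any individual’s contribution; because each answered query consumes privacy budget, overly accurate answers to too many questions eventually exhaust it, which is the “Fundamental Law” intuition.

```python
# Sketch of the Laplace mechanism, the basic differential-privacy
# primitive: add Laplace(0, sensitivity/epsilon) noise to a query result.
import math
import random

def laplace_noise(scale: float) -> float:
    # Inverse-CDF sampling of a Laplace(0, scale) random variate.
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_count(true_count: float, epsilon: float) -> float:
    # A counting query has sensitivity 1: adding or removing one person
    # changes the answer by at most 1, so the noise scale is 1/epsilon.
    return true_count + laplace_noise(1.0 / epsilon)
```

Smaller ε means more noise and stronger privacy; repeated queries against the same data compose, spending ε each time, which is why accuracy and privacy cannot both be sustained indefinitely.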

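The kind of pseudonymization a certification scheme might attest to before data is traded can be sketched as keyed hashing of direct identifiers. This is an assumption-laden illustration, not from the report: the field names and key are invented, and in practice the key would be held separately from the traded dataset.

```python
# Sketch of keyed pseudonymization: direct identifiers are replaced by an
# HMAC under a secret key, so records remain linkable for analysis but
# the original identifier cannot be recovered without the key.
# Field names and the key below are illustrative only.
import hashlib
import hmac

SECRET_KEY = b"hold-this-key-separately-from-the-dataset"

def pseudonymize(identifier: str) -> str:
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

record = {"customer_id": "C-1043", "spend": 217.50}
traded = {**record, "customer_id": pseudonymize(record["customer_id"])}

# The same input always maps to the same pseudonym, so linkage across
# records survives while the raw identifier is removed:
assert traded["customer_id"] == pseudonymize("C-1043")
assert traded["customer_id"] != "C-1043"
```

Note that keyed pseudonymization is reversible by whoever holds the key, and pseudonymized data can still be re-identified by linkage, which is why the report treats it as one layer alongside de-identification and anonymization rather than a complete answer.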


