Page 15 - FIGI - Big data, machine learning, consumer protection and privacy

velocity (the “three Vs”²⁸ – sometimes expanded to four Vs by the addition of “veracity”).²⁹ The advent of big data techniques arises from developments in how data is collected, stored and used. Data is collected using numerous applications and sensors which record consumers’ communications, transactions and movements. Distributed databases store the data, and high-speed communications transmit it at high speed, reducing the cost of data analytics. Advanced analytical processes are applied in numerous contexts.

2.2  What kind of data is used?

In the financial services context, historically, data used for decision making might have included formal representations by an applicant for a service, some personal knowledge by the local bank manager or insurance broker, and a broader range of organized data held, analyzed and profiled through credit reference bureaus. Today, big data includes alternative data, i.e., data that is not collected and documented pursuant to traditional credit reporting but from a wide range of other digital sources.

Telecommunications data

An important source of alternative data being used to extend financial services is derived from telecommunications network operators’ services. Telecommunications companies are typically constrained in their ability to collect and use data about their customers, particularly the content of their telephone calls. These have been protected by legislation on lawful interception with themes similar to the laws protecting postal communications that prohibited the opening of envelopes without a lawful basis. However, while telecommunications companies may not use the content of their customers’ communications, they also have access to (and are often required by regulation to retain³⁰) metadata.

Metadata are data about the customer’s use of their communications services, including who communicated with whom at what time, for how long, and the location from where the call was made, the combination of which can help profile an individual’s relationships and cash flows. Regular topping up of prepaid phone credit may imply a stable income. Calls to and from abroad may imply access to an international network, and potentially greater affluence. Regular calls during the working day in a dense urban area may imply a steady job, and calls made or received at the same location in the evenings may indicate the location of the individual’s home, and so economic or social class.

Many countries’ telecommunications laws and licences include clauses expressly prohibiting licensees from using, disclosing or recording any communication or content sent using an electronic communication service or information relating to such services provided to others. This is increasingly extended to metadata. For example, the EU’s ePrivacy Directive is being replaced with the ePrivacy Regulation, which fleshes out data protection themes of the GDPR further specifically for electronic communications services, addressing both personal data and metadata, such as call detail records (CDRs).³¹ However, this is not universal, and many countries do not prevent use of metadata. Even where it is prohibited, it may be permitted with the customer’s consent, enabling the operator to generate credit scores that may be used to extend digital loans.

Mobile money and other payment data

Telecommunications companies may also hold data about the use of related services that are carried over telecommunications networks. For instance, many mobile network operators provide a proprietary mobile payment service to their customers. As a result, they have access to data about when, how regularly and by how much a person tops up his or her mobile money wallet, the average balance he or she maintains, who he or she makes payments to or receives payments from and the amounts of such payments. By analysing the regularity, amounts and recipients (e.g., family, utility invoices or school fees) involved, data analytics can form a picture of the scale and reliability of a person’s cash flows (both income and expenditures), his or her social network, and ultimately enable assessment of creditworthiness. Regular payments of utility bills or school fees may indicate a regular cash flow and generally positive approach to payment of debts.

Access to such data is proving to be a useful means of introducing people hitherto excluded from financial services – due to lack of information about them – to digital financial services. Mobile network operators have in many cases partnered with banks to facilitate mobile lending using credit scores developed using the mobile network operator’s data about the customer. The operator might not share the raw mobile money data or call metadata with the banks, but will often apply algorithms to it to produce a credit score.

To take one example,³² one mobile network operator uses 48 parameters over a 6-month period and information collected in the individual’s registration (KYC) process to produce a scorecard and buckets


