
Ongoing monitoring, improvement and accountability of machine learning systems depend on documenting these objectives.215

Risk management may apply to both input and output data in machine learning models:216

On the input data side, risk mitigation will start with documenting the requirements of the model (e.g., data freshness, features and uses), the degree of dependence on data from surrounding systems, why and how personal data is included and how it is protected (e.g., encryption or otherwise), as well as its traceability. Such documentation supports effective review and maintenance. It will include assessing the "completeness, accuracy, consistency, timeliness, duplication, validity, availability, and provenance" of the input data. Mechanisms that allow the model to be tested, updated and monitored over time may also be important.
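To make this concrete, the following is a minimal sketch (in Python, using the pandas library) of how some of the quoted input-data qualities, such as completeness, duplication and timeliness, might be checked automatically; the DataFrame layout, the `updated_at` column and the freshness threshold are illustrative assumptions, not requirements drawn from the text:

    import pandas as pd

    def assess_input_quality(df: pd.DataFrame, max_staleness_days: int = 30) -> dict:
        """Simple completeness, duplication and timeliness indicators
        for a model's input data (thresholds are illustrative)."""
        report = {
            # Completeness: share of non-missing values per column.
            "completeness": (1 - df.isna().mean()).to_dict(),
            # Duplication: share of fully duplicated rows.
            "duplicate_rate": float(df.duplicated().mean()),
        }
        # Timeliness: share of records older than the freshness requirement,
        # assuming a hypothetical `updated_at` timestamp column.
        if "updated_at" in df.columns:
            age = pd.Timestamp.now() - pd.to_datetime(df["updated_at"])
            report["stale_rate"] = float(
                (age > pd.Timedelta(days=max_staleness_days)).mean()
            )
        return report

Validity, availability and provenance checks would typically be added on top of this, tested against the documented requirements of the model.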
On the output data side, various processes may be instituted to reduce the risk of machine learning models producing adverse results. Bias detection mechanisms can be put in place to ensure that population groups are not discriminated against, or at least that bias is quantified and minimised. Sometimes it may be necessary to restrict certain types of data in the model. Output data can also be analysed to detect proxies for features that might be a basis for discrimination, such as gender, race or postal code. This requires guidance from lawyers regarding the types of features that would be an unlawful basis for discrimination. Constant monitoring of the statistical distribution of output data should also improve detection of anomalies, feedback loops and other misbehaviour. Again, documenting these measures, together with ongoing testing, will improve and widen understanding of a model's risks.
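One common way to quantify bias of the kind described above is to compare favourable-outcome rates across population groups. The sketch below (Python/pandas) computes each group's approval rate relative to the most-favoured group, a ratio sometimes screened against the "four-fifths" rule of thumb; the column names and the notion of "approved" are hypothetical:

    import pandas as pd

    def disparate_impact(outcomes: pd.DataFrame,
                         group_col: str = "group",
                         decision_col: str = "approved") -> pd.Series:
        """Favourable-outcome rate of each group relative to the
        most-favoured group; values well below 1.0 warrant review."""
        rates = outcomes.groupby(group_col)[decision_col].mean()
        return rates / rates.max()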
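Detecting proxies for protected features can start with a simple statistical screen: flag model inputs whose values correlate strongly with a protected attribute. The sketch below is one minimal approach (correlation against one-hot group indicators); the threshold is arbitrary, and a real proxy analysis would go further, for example by trying to predict the protected attribute from each feature:

    import pandas as pd

    def proxy_screen(df: pd.DataFrame, protected_col: str,
                     threshold: float = 0.5) -> pd.Series:
        """Flag numeric features highly correlated with any one-hot
        indicator of the protected attribute (possible proxies)."""
        indicators = pd.get_dummies(df[protected_col]).astype(float)
        features = df.drop(columns=[protected_col]).select_dtypes("number")
        strength = features.apply(lambda col: indicators.corrwith(col).abs().max())
        return strength[strength > threshold].sort_values(ascending=False)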
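Constant monitoring of the output distribution, as described above, can be automated with a drift statistic. The sketch below uses the population stability index (PSI) between a reference window of model scores and the current window; the binning and the common 0.2 alert threshold are rules of thumb, not standards drawn from the text:

    import numpy as np

    def population_stability_index(reference: np.ndarray,
                                   current: np.ndarray,
                                   bins: int = 10) -> float:
        """PSI between two samples of model scores; larger means more drift.
        Values above ~0.2 are often treated as a signal to investigate."""
        edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
        edges[0], edges[-1] = -np.inf, np.inf   # cover the full score range
        ref_pct = np.histogram(reference, edges)[0] / len(reference) + 1e-6
        cur_pct = np.histogram(current, edges)[0] / len(current) + 1e-6
        return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))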
Risk assessment extends both to the input and output data, and to the creation and operation of algorithms. The research institute AI Now217 has proposed that public agencies carry out "algorithmic impact assessments", including in the procurement of data and software and in the operation of automated decision-making processes, as part of a wider set of accountability measures.218

Altogether, data processors need to define intended outcomes as well as unintended outcomes that should be avoided (working with legal and compliance teams), and be ready to correct the model or withdraw it from use. If outputs risk breaching consumer protection, data privacy, anti-discrimination or other laws, firms should be ready with a strategy for dealing with authorities. For instance, California's guidance on permits for autonomous vehicles has specific provisions addressing how a firm should




                Monetary Authority of Singapore’s FEAT Principles
                4. AIDA-driven decisions are regularly reviewed so that models behave as designed and intended.
                5. Use of AIDA is aligned with the firm’s ethical standards, values and codes of conduct.
                6. AIDA-driven decisions are held to at least the same ethical standards as human-driven decisions.

                Smart Campaign’s draft Digital Credit Standards
Indicator 2.1.3.0
If the repayment capacity analysis is automated (e.g., through the use of an algorithm), the effectiveness of the system in predicting the client's repayment capacity is reviewed by a unit of the organization independent from the algorithm development team (e.g., internal audit, senior management, or another department). The review provides recommendations to improve the algorithm's outcomes, which are promptly implemented.
Indicator 2.1.10.0
The provider has a rigorous internal control process to verify the uniform application of policies and procedures around credit underwriting. This applies both to cases where staff are involved and to cases where the process is automated.
Indicator 2.1.10.1
The rationale for an algorithm is documented, including the factors/types of variables used and the justification for relying on those factors. An independent unit within the organization periodically reviews alignment and compliance between the rationale, the algorithm, and its outputs. There is documented evidence of tests run and corrective actions taken.
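Indicator 2.1.3.0 above asks for an independent review of how well an automated system predicts repayment capacity. A minimal sketch of such a back-test is shown below, using scikit-learn's roc_auc_score to compare historical model scores against realised repayment outcomes; the benchmark value is an illustrative internal choice, not part of the standard:

    from sklearn.metrics import roc_auc_score

    def review_repayment_model(scores, repaid, min_auc: float = 0.7) -> dict:
        """Back-test model scores against realised repayments (1 = repaid);
        an independent unit would run this on data the developers did not pick."""
        auc = roc_auc_score(repaid, scores)
        return {"auc": float(auc), "meets_benchmark": auc >= min_auc}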





