
Figure 5 – Symbolic overview of data structure for post-processing

The lists are described in detail in sections Local log sheets, Team Assignment List (TAL) and Device Assignment List (DAL).

11.2  Data Structure Overview
The overall data structure is shown in Figure 5. It is assumed that this data structure will exist in a central data processing environment, typically a SQL database.
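
As an illustration only, the following sketch shows one way such a structure could be laid out, here using SQLite via Python. All table and column names (team_assignment, device_assignment, transfer_test, masked_out) are assumptions made for this example; as pointed out below, the methodology does not prescribe a specific schema.

# Illustrative sketch only: one possible relational layout for the
# post-processing data structure of Figure 5, using SQLite via Python.
# All table and column names are assumptions made for this example.
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS team_assignment (    -- Team Assignment List (TAL)
    team_id     TEXT PRIMARY KEY,
    tester_name TEXT NOT NULL,
    role        TEXT NOT NULL            -- e.g. sender (A-party) or receiver (B-party)
);

CREATE TABLE IF NOT EXISTS device_assignment (  -- Device Assignment List (DAL)
    device_id   TEXT PRIMARY KEY,
    team_id     TEXT NOT NULL REFERENCES team_assignment(team_id),
    operator    TEXT NOT NULL,           -- mobile money operator under test
    msisdn      TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS transfer_test (      -- one row per P2P transfer attempt
    test_id     INTEGER PRIMARY KEY,
    device_id   TEXT NOT NULL REFERENCES device_assignment(device_id),
    started_utc TEXT NOT NULL,
    completed   INTEGER NOT NULL,        -- 1 = success, 0 = failure
    duration_s  REAL,                    -- transfer duration if completed
    masked_out  INTEGER NOT NULL DEFAULT 0  -- 1 = ignore in further processing
);
"""

conn = sqlite3.connect("postprocessing.db")
conn.executescript(SCHEMA)
conn.commit()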

It needs to be pointed out that the methodology does not provide a single, prescribed data structure. Actual data structures can have additional members and shapes. Also, there is no absolute way to process data; the methodology can therefore be embedded in a wide range of post-processing environments and tool chains. This applies, in particular, to background testing (network KPI) data, which can be created in multiple ways.

Data will typically be processed in steps which also involve data validation and inspection. The goal is in any case to obtain a robust database for subsequent processing, i.e., any ambiguities, missing assignments or contradictions should be detected and resolved prior to creation of actual deliverable output.

Data preparation and validation is usually a multi-step process, starting with coarse “data cleansing” at the input-data level (e.g., visual inspection of data in Excel® files and alignment with log data). As a database is an efficient environment for data inspection and structural checks, data cleansing is typically a cyclical, incremental process.

Examples for data which may need to be cleaned out are:

•  Data resulting from test runs which have not been done under defined conditions.
•  Data taken in situations which are deemed to be exceptional and should not be part of the statistics.
•  Data resulting from unintended operation, e.g. a wrong PIN, or from an operation cancelled due to some other wrong entry.

Pre-cleaned data is imported to the database, checked, corrected in case errors are detected, and imported again until the desired state is reached. During this process, individual data sets may be “masked out”, i.e., tagged as to be ignored during further processing steps. This is necessary if information is incomplete or contradictory due to missing or inconsistent data collected in the field.³
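
A minimal sketch of such masking, again based on the illustrative schema above: defective records are tagged rather than deleted, and every subsequent processing step filters on the tag.

# Illustrative "masking out" on the assumed schema: defective records
# remain in the database but are tagged, and subsequent processing
# steps filter on the tag instead of deleting data.
import sqlite3

conn = sqlite3.connect("postprocessing.db")

# Tag contradictory records instead of deleting them.
conn.execute("""
    UPDATE transfer_test SET masked_out = 1
    WHERE completed = 1 AND duration_s IS NULL
""")
conn.commit()

# Subsequent statistics consider unmasked data only.
row = conn.execute("""
    SELECT AVG(completed) FROM transfer_test WHERE masked_out = 0
""").fetchone()
print(f"success ratio over unmasked tests: {row[0]}")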

Data cleansing may be a cyclical, repetitive process because, in order to detect some artefacts, a certain level of cleanliness is required in the first place. Also, when it comes to processing larger amounts of data, the data need to have some formal structure before meaningful checking procedures can be applied efficiently.

11.3  Naming and formatting conventions
The following conventions are essential to ensure error-free and efficient data processing over the whole chain.


