Methodology for inter-operator and cross-border P2P money transfers
Figure 5 – Symbolic overview of data structure for post-processing
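As one possible realization of the symbolic structure shown in Figure 5, the central database could contain tables for team assignments, device assignments and test runs. The following Python/sqlite3 sketch is illustrative only: all table and column names are assumptions, since the methodology deliberately does not prescribe a single data structure.

```python
import sqlite3

# Minimal sketch of a central post-processing database, assuming a
# SQLite backend. Table and column names are illustrative only; the
# methodology does not prescribe a single, fixed data structure.
SCHEMA = """
CREATE TABLE team_assignment (      -- cf. Team Assignment List (TAL)
    team_id    TEXT PRIMARY KEY,
    tester     TEXT NOT NULL
);
CREATE TABLE device_assignment (    -- cf. Device Assignment List (DAL)
    device_id  TEXT PRIMARY KEY,
    team_id    TEXT NOT NULL REFERENCES team_assignment(team_id)
);
CREATE TABLE test_run (             -- one row per transfer attempt
    run_id     INTEGER PRIMARY KEY,
    device_id  TEXT NOT NULL REFERENCES device_assignment(device_id),
    started_at TEXT NOT NULL,       -- ISO 8601 timestamp
    outcome    TEXT,                -- e.g. 'success', 'failure'
    masked     INTEGER DEFAULT 0    -- 1 = ignore in further processing
);
"""

def create_db(path=":memory:"):
    """Create the sketch database and return an open connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

An actual data structure may have additional members and shapes; the sketch merely shows how the lists and test data could be related in a SQL environment.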
The lists are described in detail in sections Local log sheets, Team Assignment List (TAL) and Device Assignment List (DAL).

11.2 Data Structure Overview

The overall data structure is shown in Figure 5. It is assumed that this data structure will exist in a central data processing environment, typically a SQL database.

It needs to be pointed out that the methodology does not provide a single, prescribed data structure. Actual data structures can have additional members and shapes. Also, there is no absolute way to process data; the methodology can therefore be embedded in a wide range of post-processing environments and tool chains. This applies, in particular, to background testing (network KPI) data, which can be created in multiple ways.

Data will typically be processed in steps which also involve data validation and inspection. The goal is in any case to obtain a robust database for subsequent processing, i.e., any ambiguities, missing assignments or contradictions should be detected and resolved prior to creation of the actual deliverable output.

Data preparation and validation is usually a multi-step process starting with coarse "data cleansing" of the input data (e.g., visual inspection of data in Excel® files and alignment with log data).

As a database is an efficient environment for data inspection and structural checks, data cleansing is typically a cyclical, incremental process. Examples of data which may need to be cleaned out are:

• Data resulting from test runs which have not been done under defined conditions.
• Data taken in situations which are deemed to be exceptional and should not be part of statistics.
• Data resulting from unintended operation, e.g. a wrong PIN, or from an operation cancelled due to some other wrong entry.

Pre-cleaned data is imported to the database, checked, corrected in case errors are detected, and imported again until the desired state is reached. During this process, individual data sets may be "masked out", i.e., tagged to be ignored during further processing steps. This is necessary if information is incomplete or contradictory due to missing or inconsistent data collected in the field.³

Data cleansing may be a cyclical, repetitive process because, in order to detect some artefacts, a certain level of cleanliness is required in the first place. Also, when it comes to processing larger amounts of data, the data need to have some formal structure before meaningful checking procedures can be applied efficiently.

11.3 Naming and formatting conventions

The following conventions are essential to ensure error-free and efficient data processing over the whole chain.
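The cyclical import-check-mask process described in the data cleansing discussion above could be sketched as follows. The table layout, the 'masked' flag and the specific check (a missing outcome field) are illustrative assumptions, not part of the methodology.

```python
import sqlite3

# Illustrative sketch of the cyclical cleansing process, assuming a
# hypothetical 'test_run' table with an 'outcome' column and a
# 'masked' flag (1 = ignore in further processing steps).
def make_sample_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test_run (run_id INTEGER PRIMARY KEY,"
                 " outcome TEXT, masked INTEGER DEFAULT 0)")
    # One record with incomplete information (no outcome recorded).
    conn.executemany("INSERT INTO test_run (outcome) VALUES (?)",
                     [("success",), ("failure",), (None,)])
    return conn

def cleanse(conn):
    """One cleansing pass: mask incomplete data sets rather than
    deleting them, so they stay inspectable but are ignored later."""
    return conn.execute(
        "UPDATE test_run SET masked = 1"
        " WHERE outcome IS NULL AND masked = 0").rowcount

def cleansing_cycle(conn):
    """Repeat passes until no further artefacts are detected, then
    report how many data sets remain available for processing."""
    while cleanse(conn) > 0:
        pass
    return conn.execute(
        "SELECT COUNT(*) FROM test_run WHERE masked = 0").fetchone()[0]
```

Masking instead of deleting keeps the raw data available for later inspection, while all subsequent processing steps simply filter on the flag.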