Page 327 - Big data - Concept and application for telecommunications
7.2.2 Users' behaviour data
It is required to choose the main services in both the packet-switched (PS) and circuit-switched (CS) domains
to monitor the users' behaviour.
For the PS domain, the data is collected through the Gn interface [b-3GPP TS29.060] and the S1-U
interface [b-3GPP TS36.413], including but not limited to: who the users are, when, where and what they are
doing on the Internet, how much traffic they consume and what the result is.
For the CS domain, the data is collected through a mobile switching centre (MSC) in the CN, including but not
limited to: who the users are, when, where and whom they call, how long the voice call
lasts and what the result is.
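The data items listed above for the two domains can be pictured as two record types. The following is a minimal sketch only: the class names, field names and sample values are hypothetical and are not defined by this Recommendation or by the referenced 3GPP specifications.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PsRecord:
    """One PS-domain usage record (all field names are hypothetical)."""
    user_id: str        # who the user is, e.g. an MSISDN or IMSI
    start: datetime     # when the activity took place
    cell_id: str        # where the user was
    application: str    # what the user was doing on the Internet
    traffic_bytes: int  # how much traffic was consumed
    success: bool       # what the result was


@dataclass(frozen=True)
class CsRecord:
    """One CS-domain voice-call record (all field names are hypothetical)."""
    caller_id: str      # who the user is
    start: datetime     # when the call was made
    cell_id: str        # where the user was
    callee_id: str      # whom the user called
    duration_s: int     # how long the voice call lasted, in seconds
    success: bool       # what the result was


# Made-up sample values, for illustration only.
ps = PsRecord("u1", datetime(2015, 1, 1, 9, 0), "cell-1", "video", 1_500_000, True)
cs = CsRecord("u1", datetime(2015, 1, 1, 9, 5), "cell-1", "u2", 42, True)
```

Keeping the two record types separate mirrors the text, which lists different items for the PS and CS domains; a real deployment may well use a different schema.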
7.2.3 Common parameter data
7.2.3.1 Common parameters in networks
Common parameter data in networks mainly includes the distribution of base transceiver stations and cell
key parameters. Users' behaviour data and signal data should be computed for every cell according to a set
of indexes which can represent the running status of the mobile network. For base transceiver stations, these
indexes can be summarized over the cells that belong to each station.
It is required to collect data items such as the cell information, the base station to which the cell belongs, the
network type and the location of the base station (i.e. the longitude and the latitude of the base station or the
administrative region that the base station belongs to).
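As an illustration of how a per-cell index might be summarized for each base transceiver station, the sketch below joins a per-cell traffic figure with the cell parameters described in this clause. The `CellInfo` fields, the `traffic_per_base_station` helper and all sample values are assumptions made for this example; summing is only one possible way to summarize an index.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CellInfo:
    """Common parameters of one cell (hypothetical field names)."""
    cell_id: str
    base_station_id: str  # the base station to which the cell belongs
    network_type: str
    longitude: float      # location of the base station
    latitude: float


def traffic_per_base_station(cells, traffic_by_cell):
    """Sum a per-cell index over the cells of each base station."""
    totals = defaultdict(int)
    for cell in cells:
        # Cells without a measured value contribute zero.
        totals[cell.base_station_id] += traffic_by_cell.get(cell.cell_id, 0)
    return dict(totals)


# Made-up sample data, for illustration only.
cells = [
    CellInfo("c1", "bs1", "LTE", 116.40, 39.90),
    CellInfo("c2", "bs1", "LTE", 116.40, 39.90),
    CellInfo("c3", "bs2", "UMTS", 121.47, 31.23),
]
traffic_by_cell = {"c1": 100, "c2": 250, "c3": 70}
totals = traffic_per_base_station(cells, traffic_by_cell)
```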
7.2.3.2 Common parameters in applications
Common parameter data in applications mainly includes the classification and names of the applications that
users use. Users' behaviour data and signal data should be computed for every application classification or
popular application according to a group of indexes which can represent the service sensing. For every cell
or base transceiver station these indexes can be summarized by users' accesses.
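A computation per application classification could look like the sketch below. The mapping `APP_CLASS`, the application names, the fallback class "other" and the choice of traffic volume as the index are all hypothetical and serve only to illustrate the grouping step.

```python
from collections import Counter

# Hypothetical mapping from application name to classification.
APP_CLASS = {"AppA": "video", "AppB": "video", "AppC": "messaging"}


def traffic_by_class(records, app_class=APP_CLASS):
    """Sum traffic per application classification.

    records: iterable of (application_name, traffic_bytes) pairs.
    Applications missing from the mapping are counted as "other".
    """
    totals = Counter()
    for app_name, traffic_bytes in records:
        totals[app_class.get(app_name, "other")] += traffic_bytes
    return dict(totals)


# Made-up sample data, for illustration only.
records = [("AppA", 300), ("AppB", 200), ("AppC", 10), ("AppZ", 5)]
by_class = traffic_by_class(records)
```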
7.2.3.3 Common parameters in users
Common parameter data in users mainly includes the mobile station international ISDN number (MSISDN)
and the terminals they use. The users' behaviour data and signal data should be computed for every place
the MSISDN belongs to or the terminal type that users have according to a group of indexes which can
represent the users' experience in the network and service. For every cell or base transceiver station these
indexes can be summarized by users' accesses.
It is required to collect data items such as the user identity, the MSISDN, the international mobile subscriber
identity (IMSI) and the terminal information, i.e. the terminal brand, the terminal model and the terminal
operating system.
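The user parameters above can serve as a lookup table when an index is computed per terminal type. The sketch below computes a session success rate per terminal brand; the `UserInfo` fields, the `success_rate_by_brand` helper, the choice of success rate as the index and all sample values are assumptions for illustration, not part of this Recommendation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UserInfo:
    """Common parameters of one user (hypothetical field names)."""
    msisdn: str
    imsi: str
    terminal_brand: str
    terminal_model: str
    terminal_os: str


def success_rate_by_brand(users, sessions):
    """Return the share of successful sessions per terminal brand.

    sessions: iterable of (msisdn, success) pairs.
    """
    brand_of = {u.msisdn: u.terminal_brand for u in users}
    ok = defaultdict(int)
    total = defaultdict(int)
    for msisdn, success in sessions:
        brand = brand_of.get(msisdn)
        if brand is None:
            continue  # no common parameter data for this user
        total[brand] += 1
        ok[brand] += int(success)
    return {brand: ok[brand] / total[brand] for brand in total}


# Made-up sample data, for illustration only.
users = [
    UserInfo("m1", "i1", "BrandA", "ModelX", "OS1"),
    UserInfo("m2", "i2", "BrandB", "ModelY", "OS2"),
]
sessions = [("m1", True), ("m1", False), ("m2", True), ("m9", True)]
rates = success_rate_by_brand(users, sessions)
```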
7.3 Data collection methods
There are different methods to collect different data categories.
The users' behaviour data is collected by deep packet inspection (DPI) from the Gn and S1-U interfaces. For
information on DPI requirements, see [ITU-T Y.2770], and for information on the DPI framework, see [ITU-T
Y.2771].
The signal data in the RAN is collected from network management elements through files, and the signal data
in the CN is collected by DPI from the corresponding interfaces.
The common parameter data is collected from external databases.
7.4 Requirements for big data pre-processing
The data collected through different methods is stored in files, messages, queues or databases. Before
analysis and storage, the data needs to be pre-processed. There are three stages of data pre-processing: data
extraction, data transformation and data loading.
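The three stages can be sketched as three small functions. This is a minimal illustration under stated assumptions: the input is a CSV text with hypothetical columns `msisdn`, `cell_id` and `traffic_bytes`, the cleaning rule (dropping records with no traffic value) is an example only, and an in-memory SQLite table named `ps_usage` stands in for the target store.

```python
import csv
import io
import sqlite3

# Made-up raw input: stray spaces in one record, a missing value in another.
RAW = (
    "msisdn,cell_id,traffic_bytes\n"
    " 8613800000001 ,c1,100\n"
    "8613800000002,c2,\n"
    "8613800000001,c1,50\n"
)


def extract(text):
    """Extraction: read raw records from the collected file content."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: drop incomplete records and normalize field types."""
    clean = []
    for row in rows:
        if not row["traffic_bytes"]:
            continue  # incomplete record
        clean.append(
            (row["msisdn"].strip(), row["cell_id"].strip(), int(row["traffic_bytes"]))
        )
    return clean


def load(rows, conn):
    """Loading: write the cleaned records into the target store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ps_usage "
        "(msisdn TEXT, cell_id TEXT, traffic_bytes INTEGER)"
    )
    conn.executemany("INSERT INTO ps_usage VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)
total = conn.execute(
    "SELECT SUM(traffic_bytes) FROM ps_usage WHERE cell_id = 'c1'"
).fetchone()[0]
```

Keeping each stage in its own function allows a different source (messages, queues or databases) or a different target store to be substituted without changing the other stages.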