Page 24 - Big data - Concept and application for telecommunications
additional expansion of infrastructure. It allows the big data service user to scale resources up or
down quickly and easily.
– Resiliency. Cloud computing can provide big data with resiliency capabilities that maintain an
acceptable level of service in the face of faults affecting normal operation.
– Cost effectiveness. Big data facilitates fast and scalable data processing such as system log analysis
and clickstream analysis. Many systems and platforms produce huge volumes of log data, and
databases have traditionally been used to perform log analysis. However, the cost of data analysis
(including costs of storage, system maintenance, etc.) is too high when traditional mechanisms are
used. Cloud computing can offer flexible and scalable resources in a cost-effective manner.
– Efficient analysis. In order to extract more valuable insights, big data applications and services need
a well-defined analytic strategy as well as processing power. The cloud computing based big data
service may dynamically use the required resources.
– Deep information extraction. Big data develops new business insights and mechanisms including
prediction and decision assistance. This differs from conventional systems, in which the data
processing logic for handling the raw data and the kind of information that can be extracted from
datasets are already known.
8 Requirements of cloud computing based big data
8.1 Data collection requirements
The data collection requirements include:
1) It is required for the CSP:BDIP to support collecting data from multiple CSN:DPs in parallel;
2) It is recommended for the CSN:DP to expose data to the CSP:BDAP by publishing metadata;
3) It is recommended that the CSP:BDIP supports collecting data from different CSN:DPs with different
modes;
NOTE – Data could be collected in different modes, such as pull mode in which the data collection
process is initiated by CSP:BDIP, or push mode in which the data collection process is initiated by
the CSN:DP.
4) It is recommended for the CSN:DP to provide a brokerage service to the CSP:BDIP for searching
accessible data;
NOTE – Brokerage provides a data catalogue that contains data information such as data specification,
data instructions, electronic access methods, license policy, data quality, etc.
5) It is recommended that the CSP:BDIP integrates data delivered by the CSC and publicly available
data;
6) Data collection can optionally be performed by the CSP:BDIP in real-time.
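The collection modes and the parallel collection described in requirements 1) and 3) can be illustrated with a minimal sketch. It is not part of the requirements. The names PullSource, PushSource, collect_one and collect_parallel are hypothetical and chosen only for illustration, and the in-memory queue merely stands in for whatever transport a real CSN:DP would use.

```python
from concurrent.futures import ThreadPoolExecutor
from queue import Empty, Queue


class PullSource:
    """Hypothetical CSN:DP used in pull mode: the collector initiates the transfer."""

    def __init__(self, name, records):
        self.name = name
        self._records = list(records)

    def fetch(self):
        # Called by the collector (the CSP:BDIP role in this sketch).
        return list(self._records)


class PushSource:
    """Hypothetical CSN:DP used in push mode: the provider initiates the transfer."""

    def __init__(self, name):
        self.name = name
        self.inbox = Queue()

    def push(self, record):
        # Called by the data provider whenever it has new data.
        self.inbox.put(record)


def collect_one(source):
    """Collect from a single provider using the mode it supports."""
    if isinstance(source, PullSource):
        return source.name, source.fetch()
    # Push mode: drain whatever the provider has already delivered.
    records = []
    while True:
        try:
            records.append(source.inbox.get_nowait())
        except Empty:
            break
    return source.name, records


def collect_parallel(sources, max_workers=4):
    """Collect from several providers in parallel, one worker task per provider."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(collect_one, sources))
```

In this sketch a caller would build one source object per data provider and pass the list to collect_parallel, which returns the collected records keyed by provider name. Threads are used only because collection is typically I/O bound; other concurrency models would serve equally well.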
8.2 Data pre-processing requirements
The data pre-processing requirements include:
1) It is required for the CSP:BDIP to support data aggregation;
NOTE – Data from different sources can be organized in the same model or data format, as described
in clause 6.1.
2) It is recommended that the CSP:BDIP provides dedicated resources for pre-processing;
NOTE – Pre-processing includes extraction, transformation and de-noising of the collected data.
3) It is recommended that the CSP:BDIP supports unification of data collected in different formats;
NOTE – Unification of data means, for example, converting data about persons/locations/dates
extracted from web pages, pictures, videos, SNS data and calling logs to text format.
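The pre-processing steps named above (transformation, de-noising and unification to a common text format) can be sketched as follows. This is an illustrative assumption rather than a prescribed method: the record fields (source, person, date), the list of accepted date formats and the function names unify_date and preprocess are all hypothetical.

```python
from datetime import datetime

# Assumed input formats for this sketch; a real deployment would define its own.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def unify_date(raw):
    """Return the date as ISO 8601 text, or None if no assumed format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def preprocess(records):
    """Unify records from different sources into one text-based model."""
    unified = []
    for record in records:
        # Transformation: normalise the person name and the date to plain text.
        person = str(record.get("person", "")).strip()
        date = unify_date(str(record.get("date", "")))
        # De-noising: drop records that are incomplete or cannot be parsed.
        if not person or date is None:
            continue
        unified.append(
            {
                "source": record.get("source", "unknown"),
                "person": person,
                "date": date,
            }
        )
    return unified
```

Records that arrive from different sources with differently formatted dates leave preprocess in a single shape, which is the kind of common model that data aggregation relies on.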