Page 24 - Big data - Concept and application for telecommunications



                    additional expansion of infrastructure. It allows the big data service user to scale resources up
                    or down quickly and easily.
            –       Resiliency. Cloud computing can give big data services the resiliency capabilities needed to
                    maintain an acceptable level of service in the face of faults affecting normal operation.
            –       Cost effectiveness. Big data facilitates fast and scalable data processing such as system log analysis
                    and clickstream analysis. Many systems and platforms produce huge volumes of log data, and
                    databases have traditionally been used to perform log analysis. However, the cost of performing data
                    analysis (including costs of storage, system maintenance, etc.) is too high when traditional
                    mechanisms are used. Cloud computing can offer flexible and scalable resources in a cost-effective
                    manner.
            –       Efficient analysis. In order to extract more valuable insights, big data applications and services need
                    a well-defined analytic strategy as well as processing power. The cloud computing based big data
                    service may dynamically use the required resources.

            –       Deep information extraction. Big data develops new business insights and mechanisms including
                    prediction and decision assistance. This differs from conventional systems, in which the data
                    processing logic for handling the raw data and the kind of information that can be extracted from
                    datasets are already known.


            8       Requirements of cloud computing based big data


            8.1     Data collection requirements
            The data collection requirements include:
            1)      It is required for the CSP:BDIP to support collecting data from multiple CSN:DPs in parallel;

            2)      It is recommended for the CSN:DP to expose data to the CSP:BDAP by publishing metadata;
            3)      It is recommended that the CSP:BDIP supports collecting data from different CSN:DPs with different
                    modes;
                    NOTE – Data could be collected in different modes, such as pull mode in which the data collection
                    process is initiated by CSP:BDIP, or push mode in which the data collection process is initiated by
                    the CSN:DP.
            4)      It is recommended for the CSN:DP to provide a brokerage service to the CSP:BDIP for searching
                    accessible data;
                    NOTE – Brokerage provides a data catalog which contains data information such as data
                    specification, data instructions, electronic access methods, license policy, data quality, etc.
            5)      It is recommended that the CSP:BDIP integrates data delivered by the CSC and publicly available
                    data;
            6)      Data collection can optionally be performed by the CSP:BDIP in real-time.
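
            The pull and push modes and the parallel collection named in the requirements above can be illustrated
            with a minimal sketch. It is not part of the requirements: the class names DataProvider and Collector,
            their methods, and the sample records are all hypothetical stand-ins for a CSN:DP and for the
            collection function of a CSP:BDIP, and a thread pool is only one possible way to collect in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Record = Dict[str, str]


class DataProvider:
    """Hypothetical stand-in for a CSN:DP holding a few records."""

    def __init__(self, name: str, records: List[Record]) -> None:
        self.name = name
        self._records = records

    def fetch(self) -> List[Record]:
        # Pull mode: the collector calls this to request data.
        return list(self._records)

    def push_to(self, sink: Callable[[str, List[Record]], None]) -> None:
        # Push mode: the provider initiates delivery to the collector's sink.
        sink(self.name, list(self._records))


class Collector:
    """Hypothetical stand-in for the collection function of a CSP:BDIP."""

    def __init__(self) -> None:
        self.collected: Dict[str, List[Record]] = {}

    def receive(self, provider_name: str, records: List[Record]) -> None:
        # Entry point used by providers operating in push mode.
        self.collected.setdefault(provider_name, []).extend(records)

    def pull_all(self, providers: List[DataProvider]) -> None:
        # Pull from several providers in parallel; results are stored
        # from the calling thread, so no locking is needed here.
        with ThreadPoolExecutor(max_workers=4) as pool:
            results = pool.map(lambda p: (p.name, p.fetch()), providers)
            for name, records in results:
                self.receive(name, records)


if __name__ == "__main__":
    collector = Collector()
    # Two providers are pulled in parallel; a third pushes on its own initiative.
    collector.pull_all([
        DataProvider("dp-a", [{"id": "1"}]),
        DataProvider("dp-b", [{"id": "2"}, {"id": "3"}]),
    ])
    DataProvider("dp-c", [{"id": "4"}]).push_to(collector.receive)
    print(collector.collected)
```

            In this sketch both modes end in the same receive method, so the collected data is stored identically
            regardless of which side initiated the transfer.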

            8.2     Data pre-processing requirements

            The data pre-processing requirements include:
            1)      It is required for the CSP:BDIP to support data aggregation;
                    NOTE – Data from different sources can be organized in the same model or data format, as described
                    in clause 6.1.
            2)      It is recommended that the CSP:BDIP provides dedicated resources for pre-processing;
                    NOTE – Pre-processing includes extraction, transformation and de-noising of the collected data.
            3)      It is recommended that the CSP:BDIP supports unification of data collected in different formats;
                    NOTE – Unification of data means, for example, converting data about persons/locations/dates
                    extracted from web pages, pictures, videos, SNS data and calling logs into text format.
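
            As an illustration of the unification described in the note above, the following minimal sketch maps
            records from two differently structured sources onto one common text record. The source names, field
            names and date layouts are assumptions chosen for the example and are not defined by the requirements.

```python
from datetime import datetime
from typing import Dict, Optional

# Hypothetical date layouts that different sources might use.
KNOWN_DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")

# Hypothetical mapping from source-specific field names to common ones.
FIELD_MAPS = {
    "web_page": {"person": "author", "location": "place", "date": "published"},
    "call_log": {"person": "caller", "location": "cell", "date": "day"},
}


def unify_date(raw: str) -> Optional[str]:
    """Return the date as ISO 8601 text, or None if no known layout matches."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def unify_record(source: str, record: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map a source-specific record onto one common text record."""
    field_map = FIELD_MAPS[source]
    return {
        "person": record.get(field_map["person"]),
        "location": record.get(field_map["location"]),
        "date": unify_date(record.get(field_map["date"], "")),
    }


if __name__ == "__main__":
    print(unify_record("web_page",
                       {"author": "Alice", "place": "Geneva", "published": "05/03/2024"}))
    print(unify_record("call_log",
                       {"caller": "Bob", "cell": "Paris", "day": "20240305"}))
```

            Values that match none of the assumed date layouts are returned as None rather than guessed, so
            unparseable input stays visible to later pre-processing steps.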



