Page 337 - Cloud computing: From paradigm to operation
Framework and requirements for cloud computing 1
– Resiliency. Cloud computing can give big data services the resiliency capabilities needed to
maintain an acceptable level of service in the face of faults affecting normal operation.
– Cost effectiveness. Big data facilitates fast and scalable data processing such as system log analysis
and clickstream analysis. Many systems and platforms produce huge volumes of log data, and
databases have traditionally been used to analyse it. However, the cost of data analysis (including
costs of storage, system maintenance, etc.) is too high when traditional mechanisms are used.
Cloud computing can offer flexible and scalable resources in a cost-effective manner.
– Efficient analysis. In order to extract more valuable insights, big data applications and services need
a well-defined analytic strategy as well as processing power. The cloud computing based big data
service may dynamically use the required resources.
– Deep information extraction. Big data develops new business insights and mechanisms including
prediction and decision assistance. This differs from conventional systems, in which the data
processing logic for handling the raw data and the kind of information that can be extracted from
datasets are already known.
8 Requirements of cloud computing based big data
8.1 Data collection requirements
The data collection requirements include:
1) It is required for the CSP:BDIP to support collecting data from multiple CSN:DPs in parallel;
2) It is recommended for the CSN:DP to expose data to the CSP:BDAP by publishing metadata;
3) It is recommended that the CSP:BDIP supports collecting data from different CSN:DPs with different
modes;
NOTE – Data could be collected in different modes, such as pull mode in which the data collection process is
initiated by CSP:BDIP, or push mode in which the data collection process is initiated by the CSN:DP.
4) It is recommended for the CSN:DP to provide a brokerage service to the CSP:BDIP for searching
accessible data;
NOTE – Brokerage provides a data catalogue which contains data information such as data specification, data
instructions, electronic access methods, license policy, data quality, etc.
5) It is recommended that the CSP:BDIP integrates data delivered by the CSC and publicly available
data;
6) Data collection can optionally be performed by the CSP:BDIP in real time.
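The pull and push collection modes and the parallel collection in requirement 1) can be illustrated with a minimal Python sketch. The class names `DataProvider` and `Collector`, and their methods, are hypothetical stand-ins for a CSN:DP and for the collection function of a CSP:BDIP; they are assumptions made for illustration and are not defined by this text.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = dict


@dataclass
class DataProvider:
    """Hypothetical stand-in for a CSN:DP holding some records."""
    name: str
    records: List[Record] = field(default_factory=list)

    def fetch(self) -> List[Record]:
        # Pull mode: called by the collector, which initiates the transfer.
        return list(self.records)

    def push_to(self, sink: Callable[[str, List[Record]], None]) -> None:
        # Push mode: the provider initiates the transfer to a sink.
        sink(self.name, list(self.records))


class Collector:
    """Hypothetical stand-in for the collection function of a CSP:BDIP."""

    def __init__(self) -> None:
        self.inbox: Dict[str, List[Record]] = {}
        self._lock = threading.Lock()

    def receive(self, provider_name: str, records: List[Record]) -> None:
        # Shared entry point for both modes; the lock guards the inbox in
        # case several providers push at the same time.
        with self._lock:
            self.inbox.setdefault(provider_name, []).extend(records)

    def pull_all(self, providers: List[DataProvider]) -> None:
        # Pull from several providers in parallel, one worker per provider.
        if not providers:
            return
        with ThreadPoolExecutor(max_workers=len(providers)) as pool:
            fetched = pool.map(DataProvider.fetch, providers)
            for provider, records in zip(providers, fetched):
                self.receive(provider.name, records)


collector = Collector()
logs = DataProvider("logs", [{"event": "login"}, {"event": "logout"}])
clicks = DataProvider("clicks", [{"page": "/home"}])
sensors = DataProvider("sensors", [{"temp_c": 21.5}])

collector.pull_all([logs, clicks])   # pull mode, initiated by the collector
sensors.push_to(collector.receive)   # push mode, initiated by the provider
```

Both modes end in the same `receive` call, so the collector can treat data uniformly regardless of which side initiated the transfer. A real deployment would replace the in-memory lists with network or storage access.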
8.2 Data pre-processing requirements
The data pre-processing requirements include:
1) It is required for the CSP:BDIP to support data aggregation;
NOTE – Data from different sources can be organized in the same model or data format, as described in
clause 6.1.
2) It is recommended that the CSP:BDIP provides dedicated resources for pre-processing;
NOTE – Pre-processing includes extraction, transformation and de-noising of the collected data.
3) It is recommended that the CSP:BDIP supports unification of data collected in different formats;
NOTE – An example of unification is converting data about persons/locations/dates extracted from web
pages, pictures, videos, SNS data and calling logs to text format.
4) It is recommended for the CSP:BDIP to support extraction of data from unstructured data or semi-
structured data into structured data.
NOTE – This requirement can be applied also to data storage.
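Requirements 3) and 4) can be sketched in Python as follows: data arriving in two formats is unified into one structured record model of persons, locations and dates. The field names, the JSON keys (`name`, `city`, `when`), the log line layout and all function names are assumptions chosen for illustration only.

```python
import json
import re
from typing import Dict, Iterable, List, Optional, Tuple

# Common target model for this sketch: every record has these text fields.
FIELDS = ("person", "location", "date")

# Assumed layout of an unstructured log line,
# e.g. "2024-05-01 call from alice in Geneva".
LOG_LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) call from (?P<person>\w+) in (?P<location>\w+)"
)


def from_json(raw: str) -> Optional[Dict[str, str]]:
    """Map a semi-structured JSON object onto the common model."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict):
        return None
    record = {
        "person": obj.get("name"),
        "location": obj.get("city"),
        "date": obj.get("when"),
    }
    # De-noising: discard records with missing fields.
    if any(value is None for value in record.values()):
        return None
    return {key: str(value) for key, value in record.items()}


def from_log_line(raw: str) -> Optional[Dict[str, str]]:
    """Extract structured fields from an unstructured text line."""
    match = LOG_LINE.search(raw)
    if match is None:
        return None
    return {name: match.group(name) for name in FIELDS}


PARSERS = {"json": from_json, "log": from_log_line}


def unify(items: Iterable[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Convert (format, raw data) pairs into records of the common model."""
    records = []
    for fmt, raw in items:
        parser = PARSERS.get(fmt)
        record = parser(raw) if parser else None
        if record is not None:  # unparseable input is dropped as noise
            records.append(record)
    return records


collected = [
    ("json", '{"name": "bob", "city": "Paris", "when": "2024-05-02"}'),
    ("log", "2024-05-01 call from alice in Geneva"),
    ("log", "corrupted ###"),
]
structured = unify(collected)
```

Here `structured` holds two records with identical keys, and the corrupted line is discarded. Keeping one parser per input format makes it possible to add further sources without changing the common model.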