Page 112 - Big data - Concept and application for telecommunications




            –       reproduce an execution from provenance for big data applications: In some cases of big data
                    execution, the environment information (e.g., hardware (H/W) information and parameter
                    configuration of big data engines) is an important factor.
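
As an illustration of the last item, the following is a minimal sketch of how environment information might be recorded alongside an execution so that a later run can be compared against it. The function names and the engine parameters (`executor_memory`, `shuffle_partitions`) are hypothetical and are not defined by this Recommendation; only Python standard library calls are used.

```python
import json
import os
import platform


def capture_environment(engine_config):
    """Return a provenance record describing the execution environment.

    `engine_config` is a hypothetical mapping of big data engine parameters.
    """
    return {
        "hardware": {
            "machine": platform.machine(),
            "processor": platform.processor(),
            "cpu_count": os.cpu_count(),
        },
        "software": {
            "os": platform.system(),
            "python_version": platform.python_version(),
        },
        "engine_config": dict(engine_config),
    }


def environment_differences(recorded, current):
    """List engine parameters whose values differ between two records."""
    old, new = recorded["engine_config"], current["engine_config"]
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


# Record the environment of the original execution.
record = capture_environment({"executor_memory": "4g", "shuffle_partitions": 200})
# Record the environment of an attempted reproduction.
rerun = capture_environment({"executor_memory": "4g", "shuffle_partitions": 64})

print(json.dumps(record["engine_config"], sort_keys=True))
print(environment_differences(record, rerun))
```

Comparing the two records before re-execution shows which parameters would need to be restored for the reproduction to match the original run.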
            The application areas of big data provenance and their benefits are:
            –       collaborative big data analysis: Big data provenance allows collaboration on big data analysis among
                    multiple domains or applications by providing information about data sources and their process steps;
            –       reuse of data processing: Generally, a big data analysis has complex process steps. Thus, a
                    well-defined analysis model, which can be derived from provenance information, is helpful for
                    similar cases of big data processing;

                    NOTE 3 – In a data processing system, data processing means a course of events occurring according
                    to an intended purpose or effect.

            –       automating big data analysis process: Provenance gives a context in which to use the data, and
                    allows automated validation and revision of derived data when the base data is updated;
            –       audit and protect intellectual property: Provenance gives a lineage of data, and it allows auditing
                    and tracing of digital rights on mash-up data.
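
The last two items can be sketched with a small lineage graph. This is an illustrative sketch only: the class `ProvenanceGraph`, its methods, and the data set names are assumptions made for the example and do not appear in this Recommendation. Walking the graph forward from a base data set gives the derived data that should be revalidated when the base data is updated; walking it backward from a mash-up gives the sources whose rights need to be audited.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Hypothetical in-memory lineage store for derived data sets."""

    def __init__(self):
        self.inputs = defaultdict(set)   # derived data -> its direct inputs
        self.outputs = defaultdict(set)  # input data -> data derived from it

    def record_derivation(self, derived, sources):
        """Record that `derived` was produced from each item in `sources`."""
        for source in sources:
            self.inputs[derived].add(source)
            self.outputs[source].add(derived)

    @staticmethod
    def _reach(start, edges):
        """Return every node reachable from `start` by following `edges`."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def lineage(self, item):
        """All upstream data that `item` depends on (for audit and tracing)."""
        return self._reach(item, self.inputs)

    def affected_by(self, item):
        """All downstream data to revalidate when `item` is updated."""
        return self._reach(item, self.outputs)


graph = ProvenanceGraph()
graph.record_derivation("cleaned_calls", ["raw_calls"])
graph.record_derivation("mashup", ["cleaned_calls", "licensed_map"])
graph.record_derivation("report", ["mashup"])

print(sorted(graph.affected_by("raw_calls")))
print(sorted(graph.lineage("report")))
```

In this example, an update to `raw_calls` marks `cleaned_calls`, `mashup` and `report` for revalidation, while the lineage of `report` includes `licensed_map`, the source whose usage rights an auditor would need to check.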


            7       Overview of big data provenance

            This clause presents an overview of big data provenance. It describes data provenance in a big data
            ecosystem, a conceptual model, provenance operations, and logical components for big data provenance.


            7.1     Data provenance in big data ecosystem
            According to [ITU-T Y.3600], a big data service provider (BDSP) supports data provenance as a part of data
            management by managing information about the origin and generation process methods of data, including
            the party or parties involved in the generation, introduction and/or mash-up processes for data.
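
The information listed above could be held in a record such as the following. This is a minimal sketch under stated assumptions: the class `ProvenanceRecord`, its field names, and the example values are hypothetical and are not specified by [ITU-T Y.3600]; only the three role names are taken from the text.

```python
from dataclasses import dataclass, field

# Roles named in the text: generation, introduction and mash-up of data.
ROLES = ("generation", "introduction", "mash-up")


@dataclass
class ProvenanceRecord:
    """Hypothetical provenance record kept by a BDSP for one data set."""

    data_id: str
    origin: str
    process_methods: list = field(default_factory=list)
    parties: dict = field(default_factory=dict)  # role -> list of party names

    def add_party(self, role, name):
        """Register a party involved in the given role."""
        if role not in ROLES:
            raise ValueError(f"unknown role: {role!r}")
        self.parties.setdefault(role, []).append(name)


rec = ProvenanceRecord(data_id="traffic-2024-01", origin="network probe A")
rec.process_methods.append("hourly aggregation")
rec.add_party("generation", "operator X")
rec.add_party("mash-up", "analytics provider Y")

print(rec)
```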




































                                  Figure 7-1 – Using data provenance in big data ecosystem






