Page 56 - Proceedings of the 2018 ITU Kaleidoscope

2018 ITU Kaleidoscope Academic Conference






[Figure 4: block diagram. Recoverable labels: Anomaly detection; Anomaly event pattern; CBR diagnosis; Active learning; KB; Diagnose labels; Raise the most important anomalies; Diagnosis; Insight.]

Figure 4 – An overview of active learning in the diagnosis process
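
As an illustration only, one round of the loop in Figure 4 (diagnose, raise the least reliable cases, let the expert label them, extend the knowledgebase) could be sketched as follows. The sketch assumes that anomaly patterns are numeric feature vectors, substitutes a simple nearest-case lookup for the CBR-based diagnosis, and uses a distance margin as the measure of unreliability. None of these choices comes from the paper, and the names diagnose, select_for_expert and active_learning_round as well as the fault labels are hypothetical.

```python
import math


def diagnose(kb, pattern):
    """Nearest-case diagnosis (a stand-in for CBR; assumes kb is non-empty).

    kb is a list of (feature_vector, label) cases. Returns (label, margin),
    where margin is close to 0 for patterns on the border between two
    diagnoses and close to 1 for clear-cut ones.
    """
    nearest = {}  # label -> distance to the closest stored case with that label
    for features, label in kb:
        d = math.dist(features, pattern)
        if label not in nearest or d < nearest[label]:
            nearest[label] = d
    ranked = sorted(nearest.items(), key=lambda item: (item[1], item[0]))
    best_label, d_best = ranked[0]
    if len(ranked) == 1:
        return best_label, 1.0
    d_second = ranked[1][1]
    total = d_best + d_second
    margin = (d_second - d_best) / total if total > 0 else 0.0
    return best_label, margin


def select_for_expert(kb, anomalies, budget=1):
    """Pick the anomalies whose automatic diagnosis is least reliable."""
    by_margin = sorted(anomalies, key=lambda a: diagnose(kb, a)[1])
    return by_margin[:budget]


def active_learning_round(kb, anomalies, expert, budget=1):
    """Raise the selected cases to the expert and store the labeled results."""
    raised = select_for_expert(kb, anomalies, budget)
    for pattern in raised:
        kb.append((pattern, expert(pattern)))
    return raised


# Hypothetical two-feature knowledgebase and newly detected anomalies.
kb = [
    ((0.0, 0.0), "congestion"),
    ((0.0, 1.0), "congestion"),
    ((10.0, 10.0), "hw_fault"),
    ((10.0, 11.0), "hw_fault"),
]
anomalies = [(0.5, 0.5), (5.0, 5.5), (10.0, 10.5)]

# The expert is modeled as a callable; here it always answers with one label.
raised = active_learning_round(kb, anomalies, expert=lambda p: "handover_failure")
print(raised)  # prints [(5.0, 5.5)]
```

Only the borderline pattern is raised, so the expert's effort goes to the case that the stored knowledge separates worst. A real deployment would replace the margin with whatever confidence measure its diagnosis function provides.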
during operation. As analyzed anomaly patterns are added, the quality of the automated diagnoses improves. The collection and maintenance of a sufficient knowledgebase can, however, be expensive and time-consuming. To reduce this effort, active learning methods can be used. In active learning, the system raises for analysis and labeling by the human expert those cases that are the hardest for the diagnosis function to diagnose or that would improve the quality of the diagnosis knowledgebase the most. This way it can guide the human expert's diagnosis process toward the anomalies that are the most meaningful for the automated diagnosis.

Figure 4 shows an overview of active learning in the diagnosis process. The black dotted lines depict the flow of data and the continuous blue lines the insights shared between the machine analytics and the human expert. The iterative process works as follows:

    1.  Initially, the detected anomalies are diagnosed by the CBR-based diagnosis.

    2.  The diagnosis results are fed to the active learning component, which determines which diagnosed anomalies are the most relevant to be raised to the human operator, i.e. the ones where the automatic diagnoses are the most unreliable or the ones which are on the border of different diagnoses.

    3.  The human expert analyzes the raised cases and provides the analysis results as new labeled and diagnosed anomaly patterns into the diagnosis component.

    4.  Steps 1-3 are repeated for:
           a.  Refining the labelling of existing anomalies
           b.  Incorporating newly detected anomalies into the labelling

5.3  Transfer Learning

Because the fault states can be very rare, even with the augmented diagnosis it can be difficult and time-consuming to create a comprehensive diagnosis knowledgebase. It would also be highly desirable to be able to diagnose and quickly remedy or even prevent problems that have never occurred in the system before. One way to enable this is to use transfer learning to share diagnosis knowledge between different networks.

[10] describes a framework for sharing diagnosis knowledge and presents an example using topic modeling and Markov Logic Networks (MLNs). It defines three components for a diagnosis cloud: Central, Gateway and Local Diagnostic Agents (CDA, GDA and LDA). The GDA is an agent that maps models between the CDA’s central storage of models and the local models in an LDA. Sharing knowledge between diagnosis knowledgebases enables fast “bootstrapping” of completely new self-healing deployments, or updating of an existing one, for example when managed network functions go through major upgrades. It also raises the question of how such diagnosis knowledge can best be shared. Standardized knowledge sharing methods may be required in addition to sharing data.

6   HOLISTIC SELF-HEALING

Another recurring principle in resilient systems is holism. In a complex system, improving the resilience of only one part or level of organization can sometimes (unintentionally) introduce fragility in another. To improve resilience, it is often necessary to work in more than one domain and scale at a time. [3]

In mobile network management, this means we cannot consider different management domains and levels in




