ITU Journal on Future and Evolving Technologies, Volume 2 (2021), Issue 4 – AI and machine learning solutions in 5G and future networks


















Fig. 13 – Comparison of confusion matrices of different models: (a) XGBoost, (b) LightGBM, (c) Random forest, (d) Decision tree, (e) SVM, (f) MLP
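Each panel in Fig. 13 is a confusion matrix whose diagonal counts correctly classified samples, and the precision, recall, and F1 values reported in Table 5 follow directly from such a matrix. The sketch below shows this computation in plain Python; the class names and toy label sequences are illustrative placeholders, not the paper's actual label set or data.

```python
# Per-class precision/recall/F1 from a confusion matrix, as in Fig. 13 / Table 5.
# Class names below are illustrative placeholders, not the paper's exact labels.
CLASSES = ["node_down", "interface_down", "packet_loss", "delay"]

def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Rows = actual class, columns = predicted class."""
    idx = {c: i for i, c in enumerate(classes)}
    m = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        m[idx[t]][idx[p]] += 1
    return m

def per_class_metrics(m, classes=CLASSES):
    """Precision, recall, and F1 for each class, derived from the matrix."""
    metrics = {}
    n = len(m)
    for i, cls in enumerate(classes[:n]):
        tp = m[i][i]                               # diagonal: correct predictions
        fp = sum(m[r][i] for r in range(n)) - tp   # predicted i, actually another class
        fn = sum(m[i]) - tp                        # actually i, predicted another class
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        metrics[cls] = (prec, rec, f1)
    return metrics

# Toy example in which packet loss and delay are confused with each other,
# mirroring the confusion the paper observes in the RF matrix.
y_true = ["node_down"] * 3 + ["packet_loss", "packet_loss", "delay", "delay"]
y_pred = ["node_down"] * 3 + ["packet_loss", "delay", "delay", "packet_loss"]
print(per_class_metrics(confusion_matrix(y_true, y_pred)))
```

In the toy run, "node_down" scores perfect precision and recall, while "packet_loss" and "delay" each drop to 0.5 because their errors fall symmetrically into each other's column, which is exactly the pattern that makes those two classes hard to separate.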

actual class. But the difference from the precision rate is that the recall rate focuses more on the proportion of True Positive (TP) samples that are successfully predicted. The F1 score is the weighted average of precision and recall, so it takes both false positives and false negatives into account. The core idea of the F1 score is to raise precision and recall as far as possible while keeping the difference between the two as small as possible.

Table 3 shows the total accuracy of the different models. XGBoost achieves the best accuracy with the least training time, followed by LightGBM and Random Forest; Decision Tree, MLP, and SVM show comparatively lower accuracy. These results demonstrate the stable and outstanding performance of tree-structured models, as well as the lift provided by the ensemble learning methods. We believe the low accuracy of the Decision Tree is due to overfitting of the training model. As for MLP, the final classification performance strongly depends on whether the optimal solution can be found; however, the back-propagation algorithm in MLP tends to converge to a local optimum, so the classification accuracy cannot be guaranteed. SVM yields the longest training time but the lowest prediction accuracy, which may be attributed to the inherent computational complexity of SVM and the influence of unrelated features in the data.

Table 5 shows the precision, recall, and F-measure of each machine learning method on the different failures. It can be seen that the Node Down failure is predicted well by every model, with high precision and recall.

In Fig. 13, the diagonal of each confusion matrix represents the number of correctly classified samples, while the off-diagonal entries are misclassifications.

5. CONCLUSION

In this paper, we employ a highly practical and reliable approach to the problem of how to automatically and rapidly detect network and device failures. First, we define a staged method comprising feature extraction from unstructured network logs, a differential approach that highlights the differences between normal and abnormal states, and several ML models that realize failure classification. Second, we apply the staged method to six popular machine learning algorithms: Decision Tree (DT), XGBoost, LightGBM, Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM). After a comparative evaluation, we reveal that the tree-based models (such as XGBoost) outperform the others in detecting network failures. Third, we employ a model refinement method that sorts the features according to their importance score; we confirm that keeping only the most useful features gains computational speed without obvious degradation of accuracy. Finally, we also find that latency and loss are confused with each other in the RF confusion matrix, so they are inherently hard to predict.

Overall, our results yield a reliable method for detecting network failures: almost 100% accuracy when detecting network and device failures, 86% accuracy when detecting packet loss and delay, and a total average accuracy of 94%. At the same time, the proposed feature extraction and refinement method can reduce computation without degrading performance.

From the evaluation results of the different types of models, we know that each method is capable of learning some parts of the problem, but not the whole problem space. This may constitute a potential subject for future studies. We can build multiple different learners and use them to build an intermediate prediction,




© International Telecommunication Union, 2021