ITU Journal on Future and Evolving Technologies, Volume 2 (2021), Issue 4 – AI and machine learning solutions in 5G and future networks
[Fig. 13 – Comparison of confusion matrices of different models: (a) XGBoost, (b) LightGBM, (c) Random forest, (d) Decision tree, (e) SVM, (f) MLP]
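As an illustration of how such a confusion matrix and the per-class scores discussed in the text can be computed, here is a minimal sketch using scikit-learn. The labels below are hypothetical stand-ins, not the authors' evaluation data; the class names merely echo the failure types mentioned in the paper.

```python
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

# Hypothetical failure-class labels for illustration only.
classes = ["node_down", "packet_loss", "delay"]
y_true = ["node_down", "packet_loss", "delay", "delay", "packet_loss", "node_down"]
y_pred = ["node_down", "delay", "delay", "packet_loss", "packet_loss", "node_down"]

# Rows = actual class, columns = predicted class; the diagonal holds
# the correctly classified samples, off-diagonal entries the errors.
cm = confusion_matrix(y_true, y_pred, labels=classes)
print(cm)

# Per-class precision, recall, and F1 (the harmonic mean of the two).
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=classes, zero_division=0)
for c, p, r, f in zip(classes, prec, rec, f1):
    print(f"{c}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

In this toy example the "node_down" class is predicted perfectly while "packet_loss" and "delay" are confused with each other, which mirrors the pattern the paper reports for its Node Down, loss, and latency classes.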
actual class. But the difference from the precision rate is that the recall rate focuses more on the proportion of True Positive (TP) samples that are successfully predicted. The F1 score is the harmonic mean of precision and recall, so it takes both false positives and false negatives into account. The core idea of the F1 score is to make precision and recall both as high as possible while keeping the difference between the two as small as possible.

Table 3 shows the total accuracy of the different models. XGBoost achieves the best accuracy with the least training time, followed by LightGBM and Random Forest; Decision Tree, MLP, and SVM show comparatively lower accuracy. These results demonstrate the stable and outstanding performance of tree-structured models, as well as the performance lift provided by ensemble learning methods. We believe that the low accuracy of the Decision Tree is caused by overfitting of the training model. As for MLP, the final classification performance depends strongly on whether the optimal solution can be found; since the back-propagation algorithm in MLP tends to converge to a local optimum, its classification accuracy cannot be guaranteed. SVM yields the longest training time but the lowest prediction accuracy, which may be attributed to the inherent computational complexity of SVM and the influence of unrelated features in the data.

Table 5 shows the precision, recall, and F-measure of each machine learning method on the different failures. It can be seen that the Node Down failure is predicted well by every model, with high precision and recall.

Fig. 13 shows the confusion matrices: the diagonal of each confusion matrix represents the number of samples that are correctly classified, while the off-diagonal entries count the wrongly classified samples.

5. CONCLUSION

In this paper, we employ a highly practical and reliable approach to the problem of how to automatically and rapidly detect network and device failures. First, we define a staged method that includes feature extraction from unstructured network logs, a differential approach to highlight the differences between normal and abnormal states, and several ML models to realize failure classification. Then, we apply the staged method to six popular machine learning algorithms: Decision Tree (DT), XGBoost, LightGBM, Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM). After a comparative evaluation, we reveal that the tree-based models (such as XGBoost) outperform the others in detecting network failures. Third, we employ a model refinement method that sorts the features according to their importance score. We confirm that keeping only the most useful features gains computational speed without an obvious degradation in accuracy. Finally, we also find from the RF confusion matrix that latency and loss are confused with each other, so they are inherently hard to predict.

Overall, our results yield a reliable method for detecting network failures: almost 100% accuracy when detecting network and device failures, 86% accuracy when detecting packet loss and delay, and a total average accuracy of 94%. At the same time, the proposed feature extraction and refinement method can reduce computation without degrading performance.

From the evaluation results of the different types of models, we know that each method is capable of learning some parts of the problem, but not the whole problem space. This may constitute a potential subject for future studies. We can build multiple different learners and use them to build an intermediate prediction,
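The direction raised here for future work, combining multiple different learners through an intermediate prediction, resembles stacked generalization. As a minimal illustration under that assumption (my own sketch with synthetic data, not the authors' implementation), scikit-learn's StackingClassifier trains base learners and feeds their cross-validated predictions to a meta-learner:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the log-derived failure features.
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each base learner captures part of the problem space; the meta-learner
# combines their intermediate (cross-validated) predictions.
stack = StackingClassifier(
    estimators=[("dt", DecisionTreeClassifier(random_state=0)),
                ("rf", RandomForestClassifier(n_estimators=100, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X_tr, y_tr)
print(f"stacked accuracy: {stack.score(X_te, y_te):.2f}")
```

The base-learner list here is hypothetical; in principle any of the six models evaluated in the paper could serve as base learners.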
© International Telecommunication Union, 2021