Page 118 - ITU Journal Future and evolving technologies Volume 2 (2021), Issue 4 – AI and machine learning solutions in 5G and future networks



phase by observing how accuracy changes when training with different numbers of features, we have two observations: (1) the highest accuracy is 94% when the number of features is more than 150; (2) accuracy reaches 93% when we use only the 30 most important features, with no obvious performance degradation.

According to our evaluation, we achieve a 100% recall rate when detecting the following four network and device failures: Node-Down, Interface-Down, Ixnetwork-BGP-Injection, and packet loss & delay. The recall rate of Ixnetwork-BGP-Hijacking detection is 71%, while the total average accuracy of our proposal is 94%. Our experiments show that XGBoost, Random Forest, and LightGBM [6] outperform the other methods in terms of training and inference time.

In summary, the main contributions of this paper are as follows.

• First, we define a staged method comprising feature extraction from unstructured network logs, a differential approach to highlight the differences between normal and abnormal states, and several ML models to realize failure classification.

• Second, we apply the staged method to six popular machine learning algorithms: Decision Tree (DT), XGBoost, LightGBM, Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM). After a comparative evaluation, we reveal that the tree-based models (such as XGBoost) outperform the others in detecting network failures.

• Third, we employ a model refinement method to sort the features according to their importance score. We confirm that with the most useful features we gain computational speed without obvious degradation of accuracy.

• Finally, we find from the RF confusion matrix that latency and loss are confused with each other, which makes them inherently hard to predict.

The rest of this paper is structured as follows. Section 2 describes the relevant research on network fault analysis. In Section 3, we present our extraction method from raw data and a comparative analysis of ML-based fault classification. Section 4 shows the experimental results obtained using our method and evaluates the comparison results. Finally, we provide a brief conclusion in Section 5.

2. RELATED WORK

There is an extensive literature on network fault detection. Most approaches rely on predefined rules, thresholds, and expert experience. Mitchell et al. present a fault detection system for LAN networks [7]. The system is based on a set of rules defined on the data collected from the network monitoring process and on the expertise of the network administrators. Although these methods can be realized automatically through scripts, they are usually low in efficiency and high in human labor costs [8, 9]. Therefore, approaches including Finite State Machine (FSM) and probabilistic approaches have also been researched [10–12]. The authors of [10] propose an FSM-based model and realize fault detection on partially observed data sequences. With the aid of FSM, the authors of [11] employ a probabilistic approach to choose synchronizing conditions and to optimally develop adaptive strategies. However, these traditional methods can hardly handle the frequent and dynamic changes in the network topology. On the other hand, the data volume obtained from managed entities is increasingly large in the era of 5G, and huge benefits can be gained from data-driven fault detection methods.

With the spread of Machine Learning (ML) technology in many fields, more and more studies on network fault analysis using ML have been proposed. Networking itself can also benefit from this promising technology. I. F. Kilinçer et al. propose a Bayesian method for monitoring and diagnosing faults that may occur in the Internet line [13]. They extract data via edge switching devices in a campus network and use the Bayesian method for classification. The accuracy of the classification results is found to be over 90%. Ruiz et al. propose a probabilistic failure localization algorithm based on Bayesian Networks (BN) to localize and identify the most probable cause of failures impacting a given service [14]. The authors use time-series monitoring data extracted from several light paths. When a service detects excessive errors, an algorithm uses the trained BN to localize and identify the most probable cause of the errors at the optical layer. Sauvanaud et al. propose anomaly detection and root cause localization for VNF using a supervised machine learning algorithm [15]. This approach detects Service Level Agreement (SLA) violations based on monitoring data. It can pinpoint the root anomalous VNF VM causing SLA violations and achieves high recall, high precision, and a low false alarm rate. The experiments in [13], [14], and [15] show that the proposed algorithms can achieve high fault classification accuracy. However, they do not compare their methods with multiple ML algorithms or under other training conditions.

Srinikethan et al. compare the link fault detection performance of three ML algorithms: SVM, MLP, and RF [16]. The authors develop a three-stage Machine Learning-based technique for Link Fault Identification and Localization (ML-LFIL) by analyzing the measurements captured from the usual traffic flows, including aggregate flow rate, end-to-end delay, and packet loss. Stadler et al. propose a method to predict service-level metrics from network device statistics using ML [17]. The authors adopt a work-regression tree and RF and investigate their prediction performance. They also compare the performance under several training conditions. References [16] and [17] compare the performance of multiple ML algorithms and seek to ascertain the effect of training conditions. However, the goal of their ML models is fault detection and service metric prediction, and they do not sufficiently cover fault classification.
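
Since fault classification is the focus here, it may help to make concrete how the per-class recall rates and the loss/latency confusion reported in Section 1 can be read off a confusion matrix. The following is a minimal sketch in plain Python. The labels are toy values invented for illustration (they are not the data or results of this paper), and the function names are hypothetical.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true_label, predicted_label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted samples / all samples of that class."""
    counts = confusion_counts(y_true, y_pred)
    support = Counter(y_true)
    return {label: counts[(label, label)] / n for label, n in support.items()}


def most_confused_pair(y_true, y_pred):
    """Return the pair of distinct classes with the most mutual misclassifications."""
    mutual = Counter()
    for (true_label, pred_label), n in confusion_counts(y_true, y_pred).items():
        if true_label != pred_label:
            # Use an unordered pair so that A->B and B->A errors are summed.
            mutual[frozenset((true_label, pred_label))] += n
    if not mutual:
        return None
    pair, _ = mutual.most_common(1)[0]
    return tuple(sorted(pair))


# Toy labels, assumed for illustration only (NOT the paper's data).
y_true = ["node-down"] * 4 + ["loss"] * 4 + ["delay"] * 4
y_pred = (
    ["node-down"] * 4
    + ["loss", "loss", "delay", "delay"]
    + ["delay", "delay", "delay", "loss"]
)

recall = per_class_recall(y_true, y_pred)
pair = most_confused_pair(y_true, y_pred)
print(recall)
print(pair)
```

In this toy example one class is always recovered while the other two are repeatedly mistaken for each other, which is the kind of pattern a confusion matrix exposes but a single average accuracy figure hides.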

© International Telecommunication Union, 2021
