ITU Journal on Future and Evolving Technologies, Volume 2 (2021), Issue 4 – AI and machine learning solutions in 5G and future networks
test & troubleshooting, reverse engineering, and improving optimization tasks. Particularly, we put the focus on the advantages of leveraging the fast and low-cost interpretations of NetXplain with respect to state-of-the-art explainability methods.

7.1 Test & troubleshooting

In order to achieve GNN-based products for networking, we need guarantees that they will work optimally when deployed in real-world networks. In this context, vendors would typically need to run extensive tests on their GNN solutions to check how they respond under different network conditions. Using NetXplain would enable us to collect human-readable interpretations of the internal data processing made by GNNs. For instance, if we have a GNN model that performs traffic engineering, we can identify the network elements that mainly drive the decisions made by the model, which are given by the explainability mask of NetXplain, and then observe whether the properties of the selected elements are consistent across similar network scenarios. This would be a good indicator that the model generalizes well and, consequently, that it is reliable for deployment. In this vein, with extensive testing we can find the safe operational range of models, which is essential for vendors to offer guarantees before selling their products (e.g., this product works optimally in networks up to 100 nodes and link capacities up to 40 Gbps). Otherwise, operators would not take the risk of deploying such solutions on their networks, as they are critical infrastructures where misconfigurations are not acceptable. In this context, making such a comprehensive analysis using state-of-the-art solutions would result in large costs for vendors, whereas the limited cost of NetXplain would enable us to dramatically reduce both the cost and the time needed before releasing the product to the market.

Moreover, this testing process would enable us to troubleshoot GNN models by identifying particular scenarios where they are not focusing on the expected elements, or where their behavior is simply not consistent with other similar scenarios. In this context, understanding where and why a model failed is crucial to refine it through an iterative training-testing process. For instance, it can help find deficiencies in the internal message-passing architecture that make the model less robust to particular network scenarios, or identify a lack of samples in the training data sets.

7.2 Reverse engineering

One interesting application of ML-based solutions is to extract information about the knowledge learned during the training phase (i.e., reverse engineering). In this context, the explainability interpretations produced by NetXplain would enable us to understand which network elements GNNs mainly consider before making their decisions. As a result, this may enable us to obtain non-trivial knowledge that can then be leveraged to design and implement efficient optimization algorithms and/or heuristics with deterministic and predictable behavior. These kinds of solutions are often perceived as more valuable by network operators, as nowadays there is a certain skepticism about applying ML-based solutions to real-world networks, mainly due to the critical nature of these infrastructures and the probabilistic guarantees typically offered by ML solutions.

7.3 Improving network optimization solutions

Network optimization problems often require dealing with very large spaces of possible actions (e.g., all the valid src-dst routing combinations in a network). As a result, optimization tools can only evaluate a small portion of configurations before they make a final decision. Thus, the exploration strategy used by these tools has a critical impact on the performance they can eventually achieve.

In this context, explainability methods can provide meaningful interpretations of the current network state that can be useful to guide optimization algorithms more efficiently (e.g., reinforcement learning [15], local search [19]). For instance, using a NetXplain model trained over RouteNet, as the one of Section 6, would enable us to point to the critical paths and links that most affect the network performance (e.g., end-to-end delays). This could be highly beneficial for optimization algorithms, which could explore alternative configurations specifically targeting these critical points (e.g., re-routing specific paths to avoid the critical points selected by NetXplain). In this context, computational efficiency is a must for optimization tools, as it directly affects the number of configurations that can be evaluated before producing the final decision. Thus, counting on solutions compatible with real-time operation, like NetXplain, offers an important competitive advantage with respect to state-of-the-art explainability solutions.
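
As a rough illustration of the applications in Sections 7.1 and 7.3, the sketch below treats an explainability mask as a plain mapping from element identifiers to importance weights. This representation is an assumption made for the example, not NetXplain's actual output format. The helper names (`top_k`, `mask_consistency`, `paths_to_reroute`) and the use of Jaccard similarity as the consistency measure are likewise hypothetical choices. The first helper pair checks whether the most important elements stay the same across similar scenarios; the last one lists the paths an optimizer might try to re-route first.

```python
from typing import Dict, List, Set

# Assumption: a mask maps an element id (e.g., a link name) to an
# importance weight. This is an illustrative format, not NetXplain's own.
Mask = Dict[str, float]


def top_k(mask: Mask, k: int) -> Set[str]:
    """Return the k elements with the highest weight (ties broken by id)."""
    ranked = sorted(mask, key=lambda e: (-mask[e], e))
    return set(ranked[:k])


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mask_consistency(masks: List[Mask], k: int) -> float:
    """Mean pairwise Jaccard similarity of the top-k sets of several scenarios.

    Values close to 1 suggest the model focuses on the same elements
    across the scenarios; low values flag scenarios worth inspecting.
    """
    tops = [top_k(m, k) for m in masks]
    pairs = [(i, j) for i in range(len(tops)) for j in range(i + 1, len(tops))]
    if not pairs:
        return 1.0
    return sum(jaccard(tops[i], tops[j]) for i, j in pairs) / len(pairs)


def paths_to_reroute(paths: Dict[str, List[str]], link_mask: Mask, k: int) -> List[str]:
    """List paths crossing any of the k most critical links, most critical first.

    A path's score is the summed weight of the critical links it traverses.
    """
    critical = top_k(link_mask, k)
    scores = {}
    for path_id, links in paths.items():
        hit = [link for link in links if link in critical]
        if hit:
            scores[path_id] = sum(link_mask[link] for link in hit)
    return sorted(scores, key=lambda p: (-scores[p], p))


if __name__ == "__main__":
    # Toy masks for two similar scenarios over three links (made-up values).
    scenario_masks = [
        {"l1": 0.9, "l2": 0.8, "l3": 0.1},
        {"l1": 0.7, "l2": 0.2, "l3": 0.6},
    ]
    print(mask_consistency(scenario_masks, k=2))

    # Toy routing: each src-dst path is the list of links it traverses.
    routing = {"A-C": ["l1", "l2"], "A-B": ["l1"], "B-C": ["l3"]}
    print(paths_to_reroute(routing, scenario_masks[0], k=2))
```

In a local-search or reinforcement-learning loop, the list returned by `paths_to_reroute` could serve as the set of candidate moves to evaluate first, so that the limited evaluation budget is spent on the points the mask marks as critical.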
8. CONCLUSIONS
In this paper, we proposed NetXplain, an efficient explainability solution for Graph Neural Networks (GNNs). Particularly, this solution uses a GNN that learns how to produce accurate interpretations over the outputs produced
Fig. 8 – Possible applications of NetXplain.
© International Telecommunication Union, 2021