Fig. 5 – High-level workflow of NetXplain.
to perform real-time troubleshooting of GNN-based solutions applied to networks, but also opens the possibility of combining these solutions with automatic optimization algorithms (e.g., local search, reinforcement learning) to solve online optimization problems more efficiently, as discussed later in Section 7. To this end, NetXplain uses a GNN that learns how to interpret a target GNN model that has been trained for a particular task. As shown in Fig. 5, the proposed GNN-based solution is trained with an explainability data set generated by an iterative optimization algorithm [3] and, once trained, the resulting model can make one-step explainability predictions for each input sample of the target GNN. Note that, thanks to the generalization capabilities of GNNs over graph-structured information, once NetXplain is trained over a particular target GNN solution, it can be applied to different input graphs not included in the training data set. In practice, when applied to GNN-based networking solutions, NetXplain is able to generalize to network scenarios with topologies of variable size and structure not seen in advance, as shown later in the experiments of Section 6. The following subsections describe the main components of this solution in more detail.
5.1 Explainability data set

To train NetXplain, we first need to generate the new explainability data set, which we refer to as $\mathcal{D}_e$. To this end, we first randomly sample a subset $\mathcal{D}' \subseteq \mathcal{D}$, where $\mathcal{D}$ is the original data set used to train the target GNN. Given this subset $\mathcal{D}'$, we now target the problem of producing, for each input graph $G \in \mathcal{D}'$, its associated explainability mask $M_G$ when applied to the target GNN. Note that this process is made from a black-box perspective (i.e., the explainability mask interprets the relevance of the input graph connections by analyzing the input-output correlations in the target GNN). For this task we can use specific state-of-the-art iterative optimization algorithms, introduced in Section 3 and further described in Section 4.2, depending on the particularities and the purpose of the target GNN (e.g., regression, classification).
Thus, we apply the process described in Section 4.2 to each of the samples $G \in \mathcal{D}'$. Hereby, we eventually obtain the final explainability data set $\mathcal{D}_e$, formally defined in Eq. (8), which maps each of the selected graphs to its corresponding optimal mask $M^*_G$:

$$\mathcal{D}_e = \{(G, M^*_G) \mid G \in \mathcal{D}'\} \qquad (8)$$
Note that due to the high cost of computing the explainability data set, it is crucial to ensure that $|\mathcal{D}'| \ll |\mathcal{D}|$. For instance, in our experiments, we observe that NetXplain is able to converge to a valid solution using only 5-10% of the samples of the original data sets. Consequently, the cost of generating the explainability data set becomes much more affordable than applying the iterative optimization algorithm over all the samples of $\mathcal{D}$.
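To illustrate this procedure, the following Python sketch shows one way $\mathcal{D}_e$ could be assembled. The names `build_explainability_dataset` and `explain_mask`, as well as the list-based data set interface, are hypothetical and introduced here only for illustration; `explain_mask` stands in for the iterative optimization algorithm of Section 4.2.

```python
import random

def build_explainability_dataset(target_gnn, dataset, fraction=0.1, explain_mask=None):
    """Assemble the explainability data set D_e of Eq. (8).

    `explain_mask` is a placeholder for the iterative optimization
    algorithm of Section 4.2: given the target GNN (treated as a
    black box) and an input graph G, it returns the optimal mask M*_G.
    """
    # Randomly sample the reduced subset D' ⊆ D, keeping |D'| << |D|
    # (e.g., 5-10% of the original samples, as reported above).
    subset = random.sample(dataset, k=max(1, int(fraction * len(dataset))))
    # Map each selected graph to its optimal explainability mask.
    return [(graph, explain_mask(target_gnn, graph)) for graph in subset]
```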
5.2 Training the explainability GNN

Fig. 6 – Adaptation of the readout function in NetXplain to produce the explainability mask.

Finally, we propose the use of an independent GNN (NetXplain) to learn how to predict explainability masks over the target GNN for an input graph $G = (V, E)$. First, we must define the underlying architecture of the NetXplain GNN, which we use for training. Particularly, we mostly keep the same architecture as the target GNN. The intuition behind this decision is that the complexity for the target GNN to learn how to make its output predictions should be similar to that of solving the explainability problem over that GNN (i.e., explaining which connections affected such predictions the most). However, a minor change is needed in the readout function $R(\cdot)$ in order to adapt it to produce the explainability mask $M_G$. As illustrated in Fig. 6, for every edge $(i, j) \in E$, we concatenate the final hidden-state vectors of its endpoints after the message-passing phase is finished (i.e., $h_i \,\|\, h_j$), and this is passed as input to $R(\cdot)$, which predicts the mask weight $m_{i,j}$ for that edge. Note that this operation can be computed in parallel for each node pair $(i, j) \in E$ of the input graph.
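To make the adapted readout concrete, below is a minimal PyTorch sketch in the spirit of Fig. 6. The two-layer MLP, the hidden dimension, and the `EdgeMaskReadout` name are assumptions for illustration, not the exact implementation used in the paper.

```python
import torch
import torch.nn as nn

class EdgeMaskReadout(nn.Module):
    """Adapted readout R(·): for each edge (i, j), concatenate the final
    hidden states h_i || h_j and predict the mask weight m_{i,j}."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, h: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # h: [num_nodes, hidden_dim], final hidden states after message passing.
        # edge_index: [2, num_edges], node indices (i, j) of each edge.
        src, dst = edge_index
        pairs = torch.cat([h[src], h[dst]], dim=-1)  # h_i || h_j for every edge
        # A single vectorized pass predicts all mask weights in parallel.
        return self.mlp(pairs).squeeze(-1)           # shape: [num_edges]
```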
A key aspect of our proposal is to reduce as much as possible the subset of randomly selected samples ($\mathcal{D}'$) used to generate the explainability data set ($\mathcal{D}_e$), which is finally used to train NetXplain's GNN. The reason is that producing explainability masks for all the samples of the original data set $\mathcal{D}$ is typically too costly with state-of-the-art explainability solutions. To achieve this, we follow a transfer learning approach. Particularly, we first initialize the explainability GNN with the same internal parameters (i.e., weights and biases) as the target GNN model, except for the readout function, whose implementation differs as explained before. This effectively initializes the explainability GNN model, as the message-passing functions of this GNN are expected to be close to those of the target GNN (e.g., similar graphs and feature distributions). Thus, during training, the main adjustment should be made over the readout function. To this end, we finally train the explainability model with the reduced explainability data set $\mathcal{D}_e$ generated by a reference explainability algorithm, which enables it to learn how to produce explainability masks accurately.
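The following sketch shows one way this transfer-learning initialization could look in PyTorch, assuming both models are `torch.nn.Module` instances and that the readout parameters are named with a `readout` prefix; both are assumptions for illustration rather than details taken from the paper.

```python
def init_from_target(explainability_gnn, target_gnn):
    """Copy the target GNN's message-passing weights into the
    explainability GNN, leaving the new edge-mask readout at its
    fresh initialization (its architecture differs, see Fig. 6)."""
    # Drop the target's readout parameters; everything else transfers.
    source = {name: tensor
              for name, tensor in target_gnn.state_dict().items()
              if not name.startswith("readout")}
    # strict=False tolerates the missing/mismatched readout keys,
    # so only the shared message-passing weights are loaded.
    explainability_gnn.load_state_dict(source, strict=False)
```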