Fig. 5 – High-level workflow of NetXplain.
to perform real-time troubleshooting of GNN-based solutions applied to networks, but also opens the possibility of combining these solutions with automatic optimization algorithms (e.g., local search, reinforcement learning) to solve online optimization problems more efficiently, as discussed later in Section 7. To this end, NetXplain uses a GNN that learns how to interpret a target GNN model that has been trained for a particular task. As shown in Fig. 5, the proposed GNN-based solution is trained with an explainability data set generated by an iterative optimization algorithm [3] and, once trained, the resulting model can make one-step explainability predictions for each input sample of the target GNN. Note that, thanks to the generalization capabilities of GNNs over graph-structured information, once NetXplain is trained over a particular target GNN solution, it can be applied to different input graphs not included in the training data set. In practice, when applied to GNN-based networking solutions, NetXplain is able to generalize to network scenarios with topologies of variable size and structure not seen in advance, as shown later in the experiments of Section 6. The following subsections describe the main components of this solution in more detail.
5.1 Explainability data set

To train NetXplain, we first need to generate the new explainability data set, which we refer to as $\mathcal{D}_e$. To this end, we first randomly sample a subset $\mathcal{D}' \subseteq \mathcal{D}$, where $\mathcal{D}$ is the original data set used to train the target GNN. Given this subset $\mathcal{D}'$, we now target the problem of producing, for each input graph $G \in \mathcal{D}'$, its associated explainability mask $M_G$ when applied to the target GNN. Note that this process is made from a black-box perspective (i.e., the explainability mask interprets the relevance of the input graph connections by analyzing the input-output correlations in the target GNN). For this task we can use specific state-of-the-art iterative optimization algorithms, introduced in Section 3 and further described in Section 4.2, depending on the particularities and the purpose of the target GNN (e.g., regression, classification).
Thus, we apply the process described in Section 4.2 to each of the samples $G \in \mathcal{D}'$. Hereby, we eventually obtain the final explainability data set $\mathcal{D}_e$, formally defined in Eq. (8), which maps each of the selected graphs to its corresponding optimal mask $M^*_G$:

$$\mathcal{D}_e = \{(G, M^*_G) \mid G \in \mathcal{D}'\} \qquad (8)$$
Note that due to the high cost of computing the explainability data set, it is crucial to ensure that $|\mathcal{D}'| \ll |\mathcal{D}|$. For instance, in our experiments, we observe that NetXplain is able to converge to a valid solution using only 5-10% of the samples of the original data sets. Consequently, the cost of generating the explainability data set becomes much more affordable than applying the iterative optimization algorithm over all the samples of $\mathcal{D}$.
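To illustrate this procedure, the following Python sketch shows one way $\mathcal{D}_e$ could be assembled. The names `build_explainability_dataset` and `explain_mask`, as well as the list-based data set interface, are hypothetical and introduced here only for illustration; `explain_mask` stands in for the iterative optimization algorithm of Section 4.2.

```python
import random

def build_explainability_dataset(target_gnn, dataset, fraction=0.1, explain_mask=None):
    """Assemble the explainability data set D_e of Eq. (8).

    `explain_mask` is a placeholder for the iterative optimization
    algorithm of Section 4.2: given the target GNN (treated as a
    black box) and an input graph G, it returns the optimal mask M*_G.
    """
    # Randomly sample the reduced subset D' ⊆ D, keeping |D'| << |D|
    # (e.g., 5-10% of the original samples, as reported above).
    subset = random.sample(dataset, k=max(1, int(fraction * len(dataset))))
    # Map each selected graph to its optimal explainability mask.
    return [(graph, explain_mask(target_gnn, graph)) for graph in subset]
```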
5.2 Training the explainability GNN

Fig. 6 – Adaptation of the readout function in NetXplain to produce the explainability mask.

Finally, we propose the use of an independent GNN (NetXplain) to learn how to predict explainability masks over the target GNN for an input graph $G = (V, E)$. First, we must define the underlying architecture of the NetXplain GNN, which we use for training. Particularly, we mostly keep the same architecture as the target GNN. The intuition behind this decision is that the complexity for the target GNN to learn how to make its output predictions should be similar to that of solving the explainability problem over that GNN (i.e., explaining which connections affected such predictions the most). However, a minor change is needed in the readout function $R(\cdot)$ in order to adapt it to produce the explainability mask $M_G$. As illustrated in Fig. 6, for every edge $(i, j) \in E$, we concatenate the final hidden-state vectors of its endpoints after the message-passing phase is finished (i.e., $h_i \,\|\, h_j$), and this is passed as input to $R(\cdot)$, which predicts the mask weight $m_{i,j}$ for that edge. Note that this operation can be computed in parallel for each node pair $(i, j) \in E$ of the input graph.
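To make the adapted readout concrete, below is a minimal PyTorch sketch in the spirit of Fig. 6. The two-layer MLP, the hidden dimension, and the `EdgeMaskReadout` name are assumptions for illustration, not the exact implementation used in the paper.

```python
import torch
import torch.nn as nn

class EdgeMaskReadout(nn.Module):
    """Adapted readout R(·): for each edge (i, j), concatenate the final
    hidden states h_i || h_j and predict the mask weight m_{i,j}."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, h: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # h: [num_nodes, hidden_dim], final hidden states after message passing.
        # edge_index: [2, num_edges], node indices (i, j) of each edge.
        src, dst = edge_index
        pairs = torch.cat([h[src], h[dst]], dim=-1)  # h_i || h_j for every edge
        # A single vectorized pass predicts all mask weights in parallel.
        return self.mlp(pairs).squeeze(-1)           # shape: [num_edges]
```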
A key aspect of our proposal is to reduce as much as possible the subset of randomly selected samples ($\mathcal{D}'$) used to generate the explainability data set ($\mathcal{D}_e$), which is finally used to train NetXplain's GNN. The reason is that producing explainability masks for all the samples of the original data set $\mathcal{D}$ is typically too costly with state-of-the-art explainability solutions. To achieve this, we follow a transfer learning approach. Particularly, we first initialize the explainability GNN with the same internal parameters (i.e., weights and biases) as the target GNN model, except for the readout function, whose implementation differs as explained before. This effectively initializes the explainability GNN model, as the message-passing functions of this GNN are expected to be close to those of the target GNN (e.g., similar graphs and feature distributions). Thus, during training, the main adjustment should be made over the readout function. To this end, we finally train the explainability model with the reduced explainability data set $\mathcal{D}_e$ generated by a reference explainability algorithm, which enables it to learn how to produce explainability masks accurately.
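The following sketch shows one way this transfer-learning initialization could look in PyTorch, assuming both models are `torch.nn.Module` instances and that the readout parameters are named with a `readout` prefix; both are assumptions for illustration rather than details taken from the paper.

```python
def init_from_target(explainability_gnn, target_gnn):
    """Copy the target GNN's message-passing weights into the
    explainability GNN, leaving the new edge-mask readout at its
    fresh initialization (its architecture differs, see Fig. 6)."""
    # Drop the target's readout parameters; everything else transfers.
    source = {name: tensor
              for name, tensor in target_gnn.state_dict().items()
              if not name.startswith("readout")}
    # strict=False tolerates the missing/mismatched readout keys,
    # so only the shared message-passing weights are loaded.
    explainability_gnn.load_state_dict(source, strict=False)
```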