Page 85 - ITU Journal Future and evolving technologies Volume 2 (2021), Issue 7 – Terahertz communications
P. 85

ITU Journal on Future and Evolving Technologies, Volume 2 (2021), Issue 7




          In each time slot   , the transmitter selects beam codes for  Algorithm 2 HAB Procedure in SC‑FDMA
          each subcarrier from the hierarchical k‑means codebook.  Input:    ,   ,    ,   
                                                                            2
                                                                       1
          At the receiver, the receive power    for the   th subcarrier  Output: b ∗        
                                        
          is measured and fed back to the transmitter. At the end of  1: Initialization: Set T = {(1, 1)},     =     = +∞
                                                                                     
          time slot   , the transmitter obtains the measured rewards  and    = (1, 1);          2,1    2,2
                                                                         
          of all subcarriers for this time slot and decides which arm  2: while    ≠      −1  &&    <     do
                                                                           
                                                                                              
          to select for which subcarrier based on a speci ic rule for  3:  ℎ = 1,    = 1 and ℙ = {(ℎ,   )};
          the next time slot   +1. In the NLOS scenario, the expected  4:  New Node Selection
          cumulative regrets    (  ) for   th subcarrier is given by  5:  while (ℎ,   ) ∈ T do
                                                                                    
                             
                                                                6:     Select new node (ℎ,   ) based on (36);
                             
                                   ∗
                     (  ) = ∑ (   (w ) −    (w (  ))) .  (35)   7:     ℙ = ℙ ∪ (ℎ,   );
                      
                                 
                                             
                                          
                                     
                            =1                                  8:   end while                
                                                                9:   Selected new code: (   ,    ) = (ℎ,   );
                                                                                                       
                                                                                                  
                                                                                                          
                 ∗
          where w is the optimal beam code for the   th subcar‑  10:  Decision Tree Update: T   +1  = T ∪ (   ,    );
                   
                                                                                         
          rier. The objective of the algorithm design for the NLOS  11:  Measure the reward    ;
          case is to  ind a selection policy that minimizes the sum  12:  Attribute Update
                                                                                     
          expected cumulative regret             (  ) over all subcarri‑  13:  for all (ℎ,   ) ∈ T do
                                 0 +  −1                        14:    if (ℎ,   ) ∈ ℙ then
          ers, i.e.,             (  ) = ∑     (  ).
                                        
                                                                15:      update    ℎ,   (  ),    ℎ,   (  ) and    ℎ,   (  ) with (37),
                                =   0
                                                                         (38) and (39), respectively;
          4.2  HBA procedure                                    16:    end if
                                                                17:    update    ℎ,   (  ) with (39);
          In the following, we extend the HBA algorithm proposed
                                                                18:  end for
          in  [9]  to  the  considered  LOS  and  NLOS  SC‑FDMA  trans‑
                                                                19:              (  ) =             (  ) = +∞;
          mission.  As  mentioned,  the  BA  problems  for  both  LOS      +1,2   −1         +1,2   −1
                                                                20:  for all (ℎ,   ) ∈ T do
          and  NLOS  scenarios,  respectively,  can  be  decomposed
                                                                21:    update     (  ) with (40);
          into several independent sub‑BA problems, which can be               ℎ,  
                                                                22:  end for
          solved in parallel. The HBA algorithm discussed in the fol‑
                                                                23: end while
          lowing provides an ef icient approach to solve one inde‑
          pendent BA problem, such as a sub‑band BA problem in   New node selection
          the  LOS  case  and  a  subcarrier  BA  problem  in  the  NLOS   In the new node selection phase, the HBA will select the
          case,                                 respectively.               
                                                               beam code w with maximal estimated mean reward—Q‑
          The  procedure  of  the  HBA  algorithm  is  shown  in  Algo‑
                                                               value for one subcarrier or sub‑band. The procedure of
          rithm 2.  The algorithm is designed based on the hierar‑
                                                               node selection can be found from line 3 to line 10 in Al‑
          chical  structure  of  the  average  payoff  function  with  re‑
                                                               gorithm 2. By exploiting the hierarchical codebook struc‑
          spect to the code words.  For a single BA problem,  con‑
                                                               ture, a binary tree search is implemented to  ind the new
          sider the beam code w(  ,   ) in the hierarchical codebook.
                                                               node. The binary search tree T is initialized in the begin‑
          If w(  ,   ) performs well, its left child w(   + 1, 2   − 1) and
                                                               ning. A node in T is represented by (ℎ,   ), where ℎ is the
          right child w(   + 1, 2  ) are highly likely to perform well,                            ℎ−1
                                                               depth from the root node and   , 1 ≤    ≤ 2  is the index
          too.  Hence, the algorithm will explore intensively within
                                                               at depth ℎ. The corresponding beam code of node (ℎ,   )
          the beam coverage      (w(  ,   )) with a good reward and
                                                                                           1
                                                               is w(ℎ,   ). After initialization, T contains only the root
          loosely in others. For this proposal, a search tree is gene-
                                                                                
                                                               (1, 1). Assume T is the search tree at time   . Starting
          rated  with  nodes  associated  with  the  beam  coverage,
                                                               from the root node, the Q‑values of two child nodes are
          independently for each sub‑BA problem. The node in a                                      
                                                               compared until a new node (   ,    ) ∉ T is selected. For
          higher layer covers a smaller region as discussed in the
                                                               a node (ℎ,   ), the selection criterion is to choose its child
          codebook design part. The algorithm operates in discrete    ∗
                                                               (ℎ + 1,    ) with largest Q‑value, that is:
          time slots, and in each time slot   ,  a new beam code is
          chosen by a deterministic rule based  on  the  attributes of
                                                                      ⎧ 2  ,                 (  ) >       (  )
          the  search tree.  The code selection procedure is executed   ∗  {            ℎ+1,2      ℎ+1,2  −1
                                                                    =   2   − 1,         ℎ+1,2   (  ) <    ℎ+1,2  −1 (  )
          independently  for  each  sub‑band  or  subcarrier  in  the   ⎨  2   − Ber(0.5),     (  ) =     (  )
                                                                      {
          LOS or NLOS scenario, respectively. After measurement       ⎩                 ℎ+1,2      ℎ+1,2  −1
          at the receiver, the observed rewards will be fed back to                                         (36)
          the transmitter. Afterwards, the transmitter updates the   where Ber(0.5) represents a Bernoulli distributed ran‑
          attributes of the entire search tree based on the newly   dom variable with a parameter of 0.5. All selected nodes
                                                               during the compare‑select procedure are saved in   ,
          observed  rewards     and  the  newly  selected  code  is
          added  to  the  search  tree.  By  repeating  this  selection‑  which is the path from the root node to the selected node.
                                                                                             
                                                                                          
          update  process,  the  HBA  algorithm  narrows  the  sear-  The  inally selected node (   ,    ) is added to the search
                                                                      
          ching  region  intelligently  until a close‑to‑optimal beam   tree T to obtain the search tree T   +1  for the next time slot
                                                                              
                                                                                       
                                                                                    
          code is selected.                                       + 1, T   +1  = T ∪ (   ,    ).
                                             © International Telecommunication Union, 2021                    73
   80   81   82   83   84   85   86   87   88   89   90