Page 144 - Proceedings of the 2017 ITU Kaleidoscope

2017 ITU Kaleidoscope Academic Conference




3.2. Video Quality Estimation using No-reference Metric

ITU-T G.1070 defines a model [15] for estimating video quality from measurable parameters of an IP network. The video quality ($V_q$) is represented as

$$V_q = 1 + I_{coding}\, I_{transmission} \tag{4}$$

where $I_{coding}$ represents the basic video quality resulting from the encoding distortion due to the combined effect of bit rate and frame rate, and $I_{transmission}$ is the factor governed by the degree of robustness to packet loss. $I_{coding}$ is expressed in terms of bit rate ($b$) and frame rate ($f$) according to equations (5)-(8) as follows.

$$I_{coding} = I_0 \exp\left(-\frac{(\ln f - \ln f_0)^2}{2 D_{FrV}^2}\right) \tag{5}$$

$$f_0 = v_1 + v_2 b \tag{6}$$

$$D_{FrV} = v_6 + v_7 b \tag{7}$$

$$I_0 = v_3\left(1 - \frac{1}{1 + (b/v_4)^{v_5}}\right) \tag{8}$$

where $I_0$ represents the maximum video quality ($0 < I_0 < 4$) at each video bit rate, $f_0$ is the optimal frame rate ($1 < f_0 < 30$) maximizing the video quality at bit rate $b$, and $D_{FrV}$ is the degree of video quality robustness due to frame rate. $I_{transmission}$ depends on the packet loss robustness factor ($D_{PplV}$) and the rate of packet loss ($p$), given by

$$I_{transmission} = \exp\left(-\frac{p}{D_{PplV}}\right) \tag{9}$$

$$D_{PplV} = v_{10} + v_{11}\exp\left(-\frac{f}{v_8}\right) + v_{12}\exp\left(-\frac{b}{v_9}\right) \tag{10}$$

Here $D_{PplV}$ represents the degree of video quality robustness against packet loss. The values of the coefficients $v_1, v_2, v_3, \ldots, v_{12}$ depend on the type of codec, the video format, the interval between key frames, and the size of the video display, as mentioned in ITU-T G.1070.

4. PROPOSED ALGORITHM

4.1. SBQA USING SOFTMAX POLICY (SBQA-SP)

This approach uses the Softmax exploration policy for action selection. The SBQA-SP algorithm is defined as follows.

SBQA-SP Algorithm

1.  Initialize the number of packets N, learning rate α and discount factor γ, last state s_last and Q-matrix Q.
2.  Compute throughput (Th) resulting from the capture of N packets.
3.  Identify current state s_cur based on the Th value.
4.  Read the resolution res and frames per second fps from the header in the streamed video.
5.  Determine current action a_cur based on the current quality segment.
6.  While s_cur < s_last // till last state reached
7.    Read the encoded bitrate b, and compute frame loss percentage p.
8.    Calculate the reward (video quality) V_q using (4)
9.    Estimate the current throughput Th_cur and, based on Th_cur, identify new state s_new.
10.   Compute new action a_new ← SoftMax(Q,s). // Exploration policy function to get best possible future action
11.   Update Q[s_cur, a_cur] based on (1)
12.   s_cur ← s_new // Update new state to the current state
13.   a_cur ← a_new // Update new action to the current action
14.   Assign action a_cur as feedback to the server
15. End
16. Go to Step 2 and continue while streaming is in progress
17. End

Softmax (Q,s)

1.  Initialize r = 1, offset = 0, sum = 0, flag = 0, prob[] = {0} and problength = length of prob[]
2.  For i = 1 to problength
3.    prob[i] = e^(Q[s,i]/r) // Access Q[s,i] in the Q-matrix
4.    sum = sum + prob[i]
5.  End
6.  For i = 1 to problength
7.    prob[i] = prob[i] / sum
8.  End
9.  Generate a random value ran, 0 < ran < 1 // pointer for random action selection
10. For i = 1 to problength
11.   If ran > offset and ran < offset + prob[i]
12.     selectedAction = i
13.     flag = 1
14.   offset = offset + prob[i]
15. End
16. If flag = 0
17.   Reset offset = 0 and repeat from step 9
18. Else
19.   Return selectedAction

4.2. SBQA USING ε-GREEDY POLICY (SBQA-GP)

SBQA-GP, a variant of the SBQA-SP algorithm, uses the ε-greedy exploration policy for action selection. The method for selecting an action using this policy is defined below.

ε-greedy (Q,s)

1.  Initialize fixed probability ε and max // Store maximum value (max) in the s-th row of the Q-matrix.
2.  Generate a random value ran in the range 0 to 1.
3.  If ran < ε
4.    selectedAction = -1
5.  Else
6.    For i = 1 to Qlength // get Qlength from the Q-matrix
7.      If Q[s,i] >= max
8.        selectedAction = i // action with max reward
9.        max = Q[s,i]
10.     End
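As a rough illustration of the video quality model in Section 3.2, the following Python sketch evaluates equations (4)-(10) for a given bit rate, frame rate and packet loss rate. The function name `video_quality` and the coefficient values in `V` are hypothetical placeholders chosen only so the example runs; the actual coefficients must be taken from the tables in ITU-T G.1070 for the codec and display format in use. The units (kbit/s, frames per second, percent) are likewise an assumption.

```python
import math

# Hypothetical placeholder coefficients v1..v12 (index 0 is unused).
# These are NOT values from ITU-T G.1070; substitute the tabulated ones.
V = [None, 1.4, 0.02, 3.8, 180.0, 1.1, 0.6, 0.001, 5.0, 400.0, 0.8, 0.5, 0.3]


def video_quality(b, f, p, v=V):
    """Estimate Vq from bit rate b, frame rate f and packet loss rate p."""
    f0 = v[1] + v[2] * b                                   # (6) optimal frame rate
    d_fr = v[6] + v[7] * b                                 # (7) robustness to frame rate
    i0 = v[3] * (1.0 - 1.0 / (1.0 + (b / v[4]) ** v[5]))   # (8) maximum quality at b
    i_coding = i0 * math.exp(
        -((math.log(f) - math.log(f0)) ** 2) / (2.0 * d_fr ** 2)
    )                                                      # (5)
    d_ppl = (v[10] + v[11] * math.exp(-f / v[8])
             + v[12] * math.exp(-b / v[9]))                # (10) robustness to packet loss
    i_transmission = math.exp(-p / d_ppl)                  # (9)
    return 1.0 + i_coding * i_transmission                 # (4)


if __name__ == "__main__":
    print(video_quality(b=500.0, f=15.0, p=0.0))
    print(video_quality(b=500.0, f=15.0, p=5.0))
```

With any valid coefficient set, the estimate falls as packet loss grows and approaches 1 when the transmission factor vanishes, which is the behaviour the reward in step 8 of SBQA-SP relies on.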
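The Softmax (Q,s) procedure can be sketched in Python as below. This is a minimal sketch, not the authors' implementation: the function name `softmax_action`, the optional `ran` argument (which lets a caller supply the random pointer for reproducibility) and the 0-based action indices are assumptions. Two details differ from the listed steps and are labeled in the comments: the row maximum is subtracted before exponentiation for numerical stability, and the retry loop is replaced by a cumulative comparison with a fallback to the last action.

```python
import math
import random


def softmax_action(q_row, r=1.0, ran=None):
    """Pick an action index from one Q-matrix row with Boltzmann probabilities."""
    # Assumption: subtracting the maximum leaves the probabilities unchanged
    # but keeps exp() from overflowing for large Q-values.
    q_max = max(q_row)
    weights = [math.exp((q - q_max) / r) for q in q_row]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Random pointer in [0, 1); drawn here unless the caller supplies one.
    if ran is None:
        ran = random.random()

    # Walk the cumulative distribution until the pointer is covered.
    offset = 0.0
    for i, prob in enumerate(probs):
        offset += prob
        if ran < offset:
            return i
    # Assumption: on floating-point round-off, fall back to the last action
    # instead of redrawing the pointer.
    return len(probs) - 1


if __name__ == "__main__":
    print(softmax_action([0.2, 0.5, 0.1]))
```

A larger temperature `r` flattens the probabilities toward uniform exploration, while a smaller one concentrates selection on the highest Q-value.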
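For comparison, a common form of the ε-greedy rule is sketched below in Python. The steps of ε-greedy (Q,s) shown above end partway through the procedure, so this is only an assumed reading: the function name `epsilon_greedy_action` is hypothetical, indices are 0-based, and the exploration branch (where the listed steps set selectedAction = -1) is assumed here to mean "pick a uniformly random action".

```python
import random


def epsilon_greedy_action(q_row, epsilon=0.1, ran=None):
    """Choose an action index from one Q-matrix row with the epsilon-greedy rule."""
    if ran is None:
        ran = random.random()

    if ran < epsilon:
        # Assumption: the exploration branch selects a uniformly random action.
        return random.randrange(len(q_row))

    # Exploitation: scan the row for the largest Q-value. The >= comparison
    # mirrors the listed steps, so ties resolve to the last maximal index.
    best = 0
    for i, q in enumerate(q_row):
        if q >= q_row[best]:
            best = i
    return best


if __name__ == "__main__":
    print(epsilon_greedy_action([0.2, 0.9, 0.4], epsilon=0.1))
```

Unlike the Softmax policy, exploration here ignores the Q-values entirely: with probability ε every action is equally likely, and otherwise the current best action is always taken.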