Page 144 - Proceedings of the 2017 ITU Kaleidoscope
3.2. Video Quality Estimation using No-reference Metric

ITU-T G.1070 defines a model [15] for estimating video quality from measurable parameters of the IP network. The video quality (Vq) is represented as

    Vq = 1 + Icoding · RPpl    (4)

where Icoding represents the basic video quality resulting from the encoding distortion due to the combined effect of bit rate and frame rate, and RPpl is the factor governed by the degree of robustness to packet loss.

Icoding is expressed in terms of bit rate (b) and frame rate (f) according to equations (5 - 8) as follows:

    Icoding = IOfr · exp(−(ln(f) − ln(f0))² / (2 · DFrV²))    (5)
    f0 = v1 + v2 · b    (6)
    DFrV = v6 + v7 · b    (7)
    IOfr = v3 · (1 − 1 / (1 + (b / v4)^v5))    (8)

where IOfr represents the maximum video quality (0 < IOfr < 4) at each video bit rate, f0 is the optimal frame rate (1 < f0 < 30) maximizing the video quality at bit rate b, and DFrV is the degree of video quality robustness due to frame rate.

RPpl depends on the packet loss robustness factor (DPplV) and the rate of packet loss (p), given by

    RPpl = exp(−p / DPplV)    (9)
    DPplV = v10 + v11 · exp(−f / v8) + v12 · exp(−b / v9)    (10)

Here DPplV represents the degree of video quality robustness against packet loss. The values of the coefficients v1, v2, v3, …, v12 depend on the type of codec, video format, interval between key frames, and size of the video display, as specified in ITU-T G.1070.

4. PROPOSED ALGORITHM

4.1. SBQA USING SOFTMAX POLICY (SBQA-SP)

This approach uses the Softmax exploration policy for action selection. The SBQA-SP algorithm is defined as follows.

SBQA-SP Algorithm

1. Initialize the number of packets N, learning rate α, discount factor γ, last state slast and Q-matrix Q.
2. Compute the throughput (Th) resulting from the capture of N packets.
3. Identify the current state scur based on the Th value.
4. Read the resolution res and frames per second fps from the header in the streamed video.
5. Determine the current action based on the current quality segment.
6. While scur < slast // till last state reached
7.     Read the encoded bit rate b, and compute the frame loss percentage p.
8.     Calculate the reward (video quality) using (4).
9.     Estimate the current throughput Thcur and, based on Thcur, identify the new state snew.
10.    Compute new action anew ← SoftMax(Q, s). // Exploration policy function to get best possible future action
11.    Update the Q-matrix based on (1).
12.    scur ← snew // Update new state to the current state
13.    acur ← anew // Update new action to the current action
14.    Assign the action as feedback to the server.
15. End
16. Go to Step 2 and continue as long as streaming occurs.
17. End

Softmax (Q, s)

1. Initialize r = 1, offset = 0, sum = 0, flag = 0 and prob[] = {0}, problength = length of prob[]
2. For i = 1 to problength
3.     prob[i] = e^(Q[s,i]/r) // Access Q[s,i] in the Q-matrix
4.     sum = sum + prob[i]
5. End
6. For i = 1 to problength
7.     prob[i] = prob[i] / sum
8. End
9. Generate a random value ran, 0 < ran < 1 // pointer for random action selection
10. For i = 1 to problength
11.     If ran > offset and ran < offset + prob[i]
12.         selectedAction = i
13.         flag = 1
14.     offset = offset + prob[i]
15. End
16. If flag = 0
17.     Repeat from step 9
18. Else
19.     Return selectedAction

4.2. SBQA USING ε GREEDY POLICY (SBQA-GP)

SBQA-GP, a variant of the SBQA-SP algorithm, uses the ε-greedy exploration policy for action selection. The method for selecting an action using this policy is defined below.

ε-greedy (Q, s)

1. Initialize fixed probability ε and max // Store maximum value (max) in sth row of Q-matrix
2. Generate a random value ran in the range 0 to 1.
3. If ran < ε
4.     selectedAction = −1
5. Else
6.     For i = 1 to Qlength // get Qlength from Q-matrix
7.         If Q[s,i] >= max
8.             selectedAction = i // action with max reward
               max = Q[s,i]
9.     End
10. Return selectedAction
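As a quick illustration of equations (4)-(10), the Python sketch below computes Vq from the bit rate, frame rate and packet-loss rate. The coefficient values used here are placeholders chosen only to produce a score in the 1-5 MOS range, not the codec-specific constants tabulated in ITU-T G.1070.

```python
import math

def g1070_video_quality(b, f, p, v):
    """Estimate video quality Vq following equations (4)-(10).

    b: bit rate (kbit/s), f: frame rate (fps), p: packet-loss rate (%),
    v: dict of model coefficients v1..v12 (codec/format specific).
    """
    f0 = v[1] + v[2] * b                                # optimal frame rate     (6)
    d_frv = v[6] + v[7] * b                             # frame-rate robustness  (7)
    i_ofr = v[3] * (1 - 1 / (1 + (b / v[4]) ** v[5]))   # max quality at b       (8)
    i_coding = i_ofr * math.exp(
        -((math.log(f) - math.log(f0)) ** 2) / (2 * d_frv ** 2))               # (5)
    d_pplv = v[10] + v[11] * math.exp(-f / v[8]) + v[12] * math.exp(-b / v[9]) # (10)
    r_ppl = math.exp(-p / d_pplv)                       # packet-loss factor     (9)
    return 1 + i_coding * r_ppl                         # Vq                     (4)

# Illustrative placeholder coefficients (NOT the standardized G.1070 values):
coeffs = {1: 1.0, 2: 0.01, 3: 4.0, 4: 500.0, 5: 2.0,
          6: 0.5, 7: 0.002, 8: 10.0, 9: 300.0, 10: 5.0, 11: 2.0, 12: 3.0}
vq = g1070_video_quality(b=800, f=25, p=1.0, v=coeffs)
```

With the placeholder coefficients above, an 800 kbit/s, 25 fps stream with 1% loss yields a Vq within the usual 1-5 range.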
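Step 11 of SBQA-SP updates the Q-matrix using equation (1), which is not reproduced in this section. The sketch below assumes the standard Q-learning update rule, Q[s][a] ← Q[s][a] + α(r + γ·max Q[s_new] − Q[s][a]); the function name and defaults are illustrative only.

```python
def q_update(Q, s, a, reward, s_new, alpha=0.5, gamma=0.9):
    """One Q-matrix update (step 11 of SBQA-SP), assuming the
    standard Q-learning rule in place of equation (1).

    Q: list of rows (one per state), s/a: current state and action,
    reward: video quality from equation (4), s_new: next state.
    """
    # Temporal-difference target uses the best Q-value of the new state.
    Q[s][a] += alpha * (reward + gamma * max(Q[s_new]) - Q[s][a])
    return Q
```

In the SBQA-SP loop, the reward is the Vq estimate of equation (4), so higher perceived quality reinforces the quality-segment action that produced it.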
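The Softmax(Q, s) routine can be sketched in Python as follows; the temperature r is fixed at 1 as in step 1 of the pseudocode, and the final fallback return is a guard for floating-point round-off not present in the original.

```python
import math
import random

def softmax_action(q_row, r=1.0):
    """Select an action index from one row of the Q-matrix using the
    Softmax (Boltzmann) exploration policy."""
    # Steps 2-5: exponentiate each Q-value scaled by the temperature r.
    prob = [math.exp(q / r) for q in q_row]
    total = sum(prob)
    # Steps 6-8: normalize into a probability distribution.
    prob = [p / total for p in prob]
    # Steps 9-15: roulette-wheel selection with a random pointer in (0, 1),
    # so actions with larger Q-values are chosen more often.
    ran = random.random()
    offset = 0.0
    for i, p in enumerate(prob):
        if offset <= ran < offset + p:
            return i
        offset += p
    return len(prob) - 1  # guard against round-off at the upper edge
```

Because the selection is probabilistic, low-valued actions are still occasionally explored, which is the point of using Softmax instead of a purely greedy rule.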
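The ε-greedy(Q, s) routine can be sketched similarly. The pseudocode returns −1 in the exploration branch; this sketch assumes that −1 signals "pick a random action" and draws one directly, which is the usual reading of ε-greedy.

```python
import random

def epsilon_greedy_action(q_row, epsilon=0.1):
    """Select an action index from one row of the Q-matrix using the
    ε-greedy policy: explore with probability ε, otherwise exploit."""
    if random.random() < epsilon:
        # Exploration branch (steps 3-4): the pseudocode returns -1 here;
        # this sketch instead draws a uniformly random action.
        return random.randrange(len(q_row))
    # Exploitation branch (steps 5-9): action with the maximum Q-value.
    best, best_q = 0, q_row[0]
    for i, q in enumerate(q_row):
        if q >= best_q:              # '>=' mirrors the pseudocode's test
            best, best_q = i, q
    return best
```

Compared with Softmax, ε-greedy explores uniformly at random with a fixed probability rather than weighting exploration by the Q-values.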