Page 42 - Proceedings of the 2018 ITU Kaleidoscope
the system performance, particularly in handling real-time applications such as video streaming. On-policy algorithms such as Sarsa and Double Sarsa [15] prove to be a better fit here because they learn the action-values at each step, depending solely on the states visited and the actions taken. When rewards are stochastic, Double Sarsa adds a significant amount of stability to the learning process at a minor increase in computational cost, while providing a higher return within an on-policy algorithm.

In assessing the quality of adaptive audio-visual streaming over reliable transport, the International Telecommunication Union's Telecommunication Standardization Sector (ITU-T) has recommended a video quality estimation model and tool in ITU-T P.1203.1 [16], which is a parametric bitstream-based quality assessment method. This model is intended for client-side monitoring of encrypted and non-encrypted HTTP/TCP-based video on demand (VoD) and live streaming services. Mode 2 of operation defined in P.1203.1 is intended for non-encrypted media and requires meta-data and up to 2% of the media stream as input, at medium complexity.

In this work, we propose a new algorithm based on an RL approach, Double Sarsa, to improve the quality of live streaming video. In Double Sarsa, two estimates of the action-value Q(s, a) are decoupled and updated against each other in order to enhance the rate of learning in a domain with a stochastic reward system. The system is characterized by a set of states and actions, where the best possible action is taken from the current state through a gradual learning process. The algorithm then calculates the reward in terms of Mean Opinion Score (MOS) using the ITU-T P.1203.1 framework and determines the resulting state of the system. Two exploration policies, softmax and ε-greedy, are used separately to select the next action, which is sent as feedback to the server, and finally the Q-matrix is updated. In this approach the adaptation problem is expressed as an optimization process with an internal QoE goal function.

In analyzing the performance of the proposed system, the Double Sarsa based quality adaptation algorithm is implemented independently with the softmax policy and with the ε-greedy policy, and its performance is compared with an existing QoE-driven strategy with future information [17]. The algorithms are implemented over the top (OTT) of HTTP, while a 4G wireless network is used to establish connectivity between client and server. Video encoding and decoding were carried out dynamically in accordance with the test parameters defined in ITU-T J.247 [18] and ITU-T P.1203.1 [16], as listed in Table 1.

The decoded video sequences at the receiver were compared with the original video transmitted by the server for quality evaluation during experimentation. Full Reference (FR) video quality metrics, namely Peak Signal to Noise Ratio (PSNR), Structural Similarity (SSIM), Multi-Scale SSIM (MS-SSIM), and Video Quality Metric (VQM), are used to evaluate the quality of the streaming video at the client. The server dynamically adjusts the video resolution based on the client feedback; degradation of video quality is observed at the receiver due to scaling down of the original content at the server. The proposed solution could be standardized considering its practical use in supporting adaptive streaming of video in IP networks over wireless systems.

2. PROPOSED SYSTEM

2.1 System Architecture

The proposed framework follows the client-server model, in which the server's activity is simplified at the cost of an expanded observation and analysis process at the client. It is implemented on top of HTTP, where the live (or stored) video is streamed from the server to the client connected through a 4G wireless network. The media content is encoded progressively using the ITU-T H.264 video codec [19] and then streamed to the client. Once streaming starts, the proposed algorithm at the client side analyses the quality of the streamed media along with the network conditions in order to calculate the decision parameters to be sent as feedback to the server. The server analyses the feedback and adjusts the video quality to deliver the maximum achievable QoE that can be supported by the currently available network bandwidth.

Table 1 – Test parameters as per ITU recommendations

Standards        Parameters                   Metrics
ITU-T J.247      Transmission errors          With packet loss
                 Frame rate                   5 fps to 30 fps
                 Video codec                  H.264/AVC (MPEG-4 Part 10), VC-1, Windows Media 9, Real Video (RV10), MPEG-4 Part 2
                 Temporal errors              Maximum of 2 seconds
ITU-T P.1203.1   Input video length           20 seconds
                 Video resolution / bitrate   240p: 75-150 kbps
                                              360p: 220-450 kbps
                                              480p: 375-750 kbps
                                              720p: 1050-2100 kbps
                                              1080p: 1875-12500 kbps

2.2 Server Side Functions

The server first obtains the live media content, or the location of the stored video in memory, and sets the media URL. It then uses the Java framework for the VLC media player (VLCJ) to set the appropriate encoding parameters to ensure continuous streaming of the video. The media player object is initialised and the streaming begins through the specified HTTP port. It then waits for the client's feedback
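The resolution and bitrate rungs of Table 1, together with the server's reaction to client feedback outlined in Section 2.1, can be illustrated with a minimal Python sketch. The feedback format (an integer number of rungs to move up or down) and the helper names `apply_feedback` and `highest_supported` are assumptions made for illustration only; this section does not specify how the feedback is encoded or how the server maps it to a resolution.

```python
# Resolution rungs with their bitrate ranges in kbps, as listed in Table 1.
LADDER = [
    ("240p", 75, 150),
    ("360p", 220, 450),
    ("480p", 375, 750),
    ("720p", 1050, 2100),
    ("1080p", 1875, 12500),
]


def apply_feedback(current, step):
    """Move `step` rungs up (positive) or down (negative), clamped to the ladder.

    The integer `step` is an assumed feedback format, not the paper's.
    """
    return max(0, min(len(LADDER) - 1, current + step))


def highest_supported(bandwidth_kbps):
    """Index of the highest rung whose minimum bitrate fits the bandwidth.

    Falls back to the lowest rung when even 240p does not fit.
    """
    best = 0
    for index, (_, low_kbps, _) in enumerate(LADDER):
        if low_kbps <= bandwidth_kbps:
            best = index
    return best


if __name__ == "__main__":
    rung = apply_feedback(2, +1)          # hypothetical "one rung up" feedback
    rung = min(rung, highest_supported(900))  # cap by an assumed 900 kbps link
    print(LADDER[rung][0])
```

Clamping in `apply_feedback` keeps a stray feedback value from pushing the server outside the tested resolutions, and capping with `highest_supported` is one simple way to respect the currently available bandwidth.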
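The Double Sarsa scheme summarized in the introduction above, with two decoupled action-value tables and ε-greedy or softmax exploration, can be sketched in Python as follows. This is a minimal sketch of the commonly described tabular form of Double Sarsa, not the authors' implementation: the state and action encoding, the learning rate `alpha`, the discount `gamma`, and the numeric reward standing in for a P.1203.1 MOS value are all assumptions.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    return rng.choice([a for a, q in enumerate(q_values) if q == best])


def softmax(q_values, temperature, rng):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights)[0]


class DoubleSarsa:
    """Tabular Double Sarsa with two action-value tables, qa and qb."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, seed=0):
        self.qa = [[0.0] * n_actions for _ in range(n_states)]
        self.qb = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma = alpha, gamma
        self.rng = random.Random(seed)

    def mean_q(self, state):
        """Average of both estimates; the behaviour policy acts on this."""
        return [(a + b) / 2 for a, b in zip(self.qa[state], self.qb[state])]

    def act(self, state, policy, param):
        """Choose an action with `policy` (epsilon_greedy or softmax)."""
        return policy(self.mean_q(state), param, self.rng)

    def update(self, s, a, reward, s_next, a_next):
        """Update one randomly chosen table, bootstrapping from the other."""
        if self.rng.random() < 0.5:
            learn, other = self.qa, self.qb
        else:
            learn, other = self.qb, self.qa
        target = reward + self.gamma * other[s_next][a_next]
        learn[s][a] += self.alpha * (target - learn[s][a])


if __name__ == "__main__":
    agent = DoubleSarsa(n_states=5, n_actions=3)
    s = 2
    a = agent.act(s, epsilon_greedy, 0.1)
    s_next, reward = 3, 3.8  # hypothetical next state and MOS-like reward
    a_next = agent.act(s_next, epsilon_greedy, 0.1)
    agent.update(s, a, reward, s_next, a_next)
    print(agent.mean_q(s))
```

Passing `softmax` with a temperature instead of `epsilon_greedy` with an ε to `act` switches the exploration policy without touching the update rule, which mirrors the way the two policies are evaluated separately in the text.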