Page 41 - Proceedings of the 2018 ITU Kaleidoscope

DOUBLE SARSA BASED MACHINE LEARNING TO IMPROVE QUALITY OF VIDEO STREAMING OVER HTTP THROUGH WIRELESS NETWORKS



Dhananjay Kumar¹; Narmathaa Logganathan¹; Ved P. Kafle²

¹ Department of Information Technology, Anna University, MIT Campus, Chennai
² National Institute of Information and Communications Technology, Tokyo


ABSTRACT

Adaptive streaming over HTTP is widely advocated to enhance the Quality of Experience (QoE) in a bitrate-constrained IP network. However, most previous approaches, based on estimating the available link bandwidth or the fullness of the media buffer, tend to become ineffective due to the variability of IP traffic patterns. In this paper, we propose a Double State-Action-Reward-State-Action (Sarsa) based machine learning method to improve user QoE in IP networks. The Pv video quality estimation model specified in Recommendation ITU-T P.1203.1 is embedded in the learning process for the estimation of QoE. We have implemented the proposed Double Sarsa based adaptation method on top of HTTP in a 4G wireless network and assessed the resulting quality improvement using full-reference video quality metrics. The results show that the proposed method outperforms an existing approach and can be recommended in the standardization of future audio-visual streaming services over wireless IP networks. We observed an average improvement of 7% in PSNR and 25% in VQM during live streaming of video.

Keywords – Video streaming, QoE, Machine learning, Sarsa, Video quality measurements

1. INTRODUCTION

Video streaming applications have dominated IP networks over the last few years, and their share is continuously expanding. As per the Cisco Visual Networking Index, 82% of global consumer internet traffic will be video by 2021, up from 73% in 2016 [1]. Further, mobile data traffic is expected to increase sevenfold between 2016 and 2021. The Ericsson Mobility Report (November 2017) forecasts one billion subscriptions for 5G mobile broadband in 2023 [2]. It is recommended that future IP networks should not only be strong and resilient, but also support interoperability based on open standards with global reach [3]. In order to handle the surging video streaming traffic, 3GPP and MPEG have proposed Dynamic Adaptive Streaming over HTTP (DASH) [4]. In DASH, a video is split into a number of chunks of equal duration, and each chunk is encoded in multiple versions of differing quality and hence bitrate [5]. The end goal is to provide smooth video streaming services even in networks with constrained available bandwidth.

The underlying principle of DASH, i.e., HTTP Adaptive Streaming (HAS), allows the video chunks to be served to clients by standard HTTP servers in either live or on-demand form. When network conditions change, a client can progressively switch video versions for the chunks to be downloaded in order to maintain continuous video playback. This dynamic adaptation leads to better Quality of Experience (QoE). However, HAS does not itself control the transmission rate of video data, which is governed entirely by TCP [6]. It also takes advantage of the universal deployment of HTTP/TCP; for example, HTTP-based delivery avoids NAT and firewall traversal issues. Furthermore, it allows standard HTTP servers and caches to be used for streaming the content, with reliable transmission provided by TCP [7].

To maximize end-user QoE, the adaptation process needs to dynamically manage the streaming media, which dictates the perceived quality of the displayed content. However, developing a robust prediction model for QoE that is reliable, accurate, and scalable remains a challenge [8]. There is a tradeoff between available network resources and perceived QoE. Dynamically adapting the coding rate of the requested video to the available transmission resources could mitigate this problem, since a reduction in coding rate degrades QoE less than other impairments such as packet loss and delay [9]. The solution also needs to take into account the standardization requirements for supporting audio-visual streaming services over IP networks globally.

In a challenging situation where prediction modeling faces several limitations, Reinforcement Learning (RL) offers a promising technique that can be incorporated in the system as an elegant and practical solution. However, the large state space of the Markov decision process in these techniques becomes a major design challenge [10]. Under RL, the policy in Q-learning is governed by the selection of the state-action pair, the associated reward, and the update rule. However, convergence requires that all state-action pairs be updated [11]. Further, in some stochastic environments, overestimation in Q-learning slows down the learning process [12]. The complexity of advanced algorithms like Deep Q-learning [13] and Double Q-learning [12-14] could inflict





978-92-61-26921-0/CFP1868P-ART © 2018 ITU      – 25 –      Kaleidoscope