Page 59 - ITU Journal Future and evolving technologies Volume 2 (2021), Issue 4 – AI and machine learning solutions in 5G and future networks
10. CONCLUSIONS AND FUTURE WORK

The proposed video streaming pipeline, which uses a multimodal adaptive normalization-based architecture to generate the video, reduces the network bandwidth required under unreliable Internet conditions. The pipeline can adjust the quality of experience according to the available compute resources and bandwidth. It also supports data privacy by synthesizing the video on an avatar of the person.

Although this implementation provides a proof of concept for the underlying idea, further work is needed to implement a full-body, low-latency, low-bandwidth video streaming environment that further enhances the quality of experience. Given the rapid improvement of hardware capabilities in mobile devices and personal computers, hardware is unlikely to be a major obstacle. As evidenced by the recent announcement of the NVIDIA Maxine project [3], the remaining hurdles are surmountable, and these ideas can be translated into a practical system that provides immense gains over conventional methods.

REFERENCES

[1] Sadjad Fouladi, John Emmons, Emre Orbay, Catherine Wu, Riad S. Wahby, and Keith Winstein. “Salsify: Low-Latency Network Video through Tighter Integration between a Video Codec and a Transport Protocol”. In: 15th USENIX Symposium on Networked Systems Design and Implementation (NSDI 18). Renton, WA: USENIX Association, Apr. 2018, pp. 267–282. ISBN: 978-1-939133-01-4. URL: https://www.usenix.org/conference/nsdi18/presentation/fouladi.
[2] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative Adversarial Networks. 2014. arXiv: 1406.2661 [stat.ML].
[3] Sid Sharma. “AI Can See Clearly Now: GANs Take the Jitters Out of Video Calls”. In: NVIDIA Blog (Aug. 2020).
[4] Neeraj Kumar, Srishti Goel, Ankur Narang, and Mujtaba Hasan. “Robust One Shot Audio to Video Generation”. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. June 2020.
[5] Johannes L. Schönberger and Jan-Michael Frahm. “Structure-from-Motion Revisited”. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), pp. 4104–4113.
[6] S. Ma, Xinfeng Zhang, Chuanmin Jia, Zhenghui Zhao, S. Wang, and Shanshe Wang. “Image and Video Compression With Neural Networks: A Review”. In: IEEE Transactions on Circuits and Systems for Video Technology 30 (2020), pp. 1683–1698.
[7] Chaochao Lu and X. Tang. “Surpassing Human-Level Face Verification Performance on LFW with GaussianFace”. In: AAAI. 2015.
[8] Z. Huang, Xiaowei Zhao, S. Shan, R. Wang, and X. Chen. “Coupling Alignments with Recognition for Still-to-Video Face Recognition”. In: 2013 IEEE International Conference on Computer Vision (2013), pp. 3296–3303.
[9] N. Kumar, P. Belhumeur, and S. Nayar. “FaceTracer: A Search Engine for Large Collections of Images with Faces”. In: ECCV. 2008.
[10] I. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, S. Ozair, Aaron C. Courville, and Yoshua Bengio. “Generative Adversarial Nets”. In: NIPS. 2014.
[11] Sergey Ioffe and Christian Szegedy. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift”. Feb. 2015.
[12] Nitish Srivastava, Geoffrey E. Hinton, A. Krizhevsky, Ilya Sutskever, and R. Salakhutdinov. “Dropout: a simple way to prevent neural networks from overfitting”. In: J. Mach. Learn. Res. 15 (2014), pp. 1929–1958.
[13] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. “Instance Normalization: The Missing Ingredient for Fast Stylization”. July 2016.
[14] Xun Huang and Serge Belongie. “Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization”. Oct. 2017, pp. 1510–1519. DOI: 10.1109/ICCV.2017.167.
[15] Hyeonseob Nam and Hyo-Eun Kim. Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks. May 2018.
[16] Junho Kim, Minjae Kim, Hyeon-Woo Kang, and Kwanghee Lee. U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation. July 2019.
[17] Jimmy Ba, Jamie Kiros, and Geoffrey Hinton. “Layer Normalization”. July 2016.
[18] Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. “Semantic Image Synthesis with Spatially-Adaptive Normalization”. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
[19] Arun Mallya, Ting-Chun Wang, Karan Sapra, and Ming-Yu Liu. World-Consistent Video-to-Video Synthesis. July 2020.
[20] Peter de Rivaz and Jack Haughton. “AV1 bitstream & decoding process specification”. In: The Alliance for Open Media (2018), p. 182.
[21] Versatile Video Coding (VVC). https://jvet.hhi.fraunhofer.de/. Accessed: 2020-10-27.
© International Telecommunication Union, 2021