Page 24 - Kaleidoscope Academic Conference Proceedings 2022

can be considered “consciousness within a restricted domain” [7]: You feel, think and behave as if the place were real, even though you know it is not. This property is key in the success of immersive technologies, since consciousness is a very primitive cognitive function, not associated with any cortical region [8]. In other words, the perceptual illusion of “being there” created by XR technology really addresses the deepest layers of the human brain, thus making it universally applicable to any kind of person.

This perceptual illusion of presence involves different factors [9]. Biocca et al. identified three key components [10]: spatial presence, the illusion of being in a different place; social presence, the illusion of being and interacting with other people; and self-presence, the illusion of having one’s body integrated in the virtual experience. Creating the realverse means addressing those components by transmitting or generating this sense of presence. Technologically, it requires implementing the four perceptual building blocks shown in Figure 1 [11]:

[Figure 1 labels: Spatial Presence, Social Presence, Self Presence; Face, Visit, Meet, Move]
Figure 1 – Building blocks (or fundamental elements) of XR communication systems: face, visit, meet, and move [11].
© 2022 Nokia

1. Face is the property of the system to transmit in real time a visual representation of the other person, e.g. through a video-conferencing system. This element enables visual communication. Seeing the other person is key to transmitting non-verbal communication cues, including showing objects of the personal space.

2. Visit is the property of the system to transmit in real time a visual representation of the surroundings of the other person. This enables remote presence: the sense of “being there”, in the physical environment of the remote person, and being able to operate in it and discuss it.

3. Meet is the property of the system to represent the other person in the same (virtual or physical) space as the user. This enables shared immersion: being immersed in the same (virtual or physical) environment and interacting with the same (virtual or physical) objects.

4. Move is the property of the system itself to represent the user within it and enable embodied interaction. It means that the actions of the users are represented within the system and allow the user to interact with it.

As we showed in [11], all XR communication systems include one or several of these building blocks, and the combination of blocks they address has a strong impact on the technical architecture they implement. In the next subsection we will describe how we are addressing each of them, with the target of building a distributed reality system which is able to provide all of them simultaneously.

2.1  Visit: remote presence

To address the problem of immersive remote presence we have developed The Owl, a prototype 360-degree video-based telepresence system. 360-degree videos are video recordings where a view in every direction is recorded at the same time, shot using an omnidirectional camera or a collection of cameras. Our prototype, shown in Figure 2, consists of a commercial omnidirectional video camera, a control system on a Raspberry Pi 4, a backend in the cloud, and a client for Meta Quest developed in the 3-D version of the game engine Unity. The system allows video transmission in real time, in equirectangular projection¹, with 4K resolution (8 Mbps) and conversational delay (< 500 ms), as well as audio in both directions. The approach is similar to other prototypes in the state of the art, such as ExLeap [13] or Show Me Around [14].

Figure 2 – The Owl: an immersive communication system prototype (bottom left), which allows a person wearing a head-mounted display (bottom right) to feel present in a remote location and interact with the people there (top).

The Owl has been field-tested in different scenarios, and has allowed us to evaluate 360-degree video technology for real-time communications in use cases such as education [15] or hybrid conferences, with face-to-face and virtual attendees [16]. For better assessment of its QoE, a novel methodology

¹ The equirectangular projection maps the longitude and latitude of the sphere videos to the horizontal and vertical coordinates of the rectangular video [12].

IEEE TRANSACTIONS ON AFFECTIVE COMPUTING VOL. XX, NO. X, XXX XXXX

Fig. 2. Video sources screenshots: (a) Coffee shop, (b) International office, (c) Study in Spain.

TABLE 2
Test content characteristics
Name                  Genre                  Perspective-taking  Description
Coffee shop           Everyday conversation  Observer            A coffee conversation between foreign and local students about cultural differences
International office  Educational            Actor               A presentation given by a professor to students about the foreign application process
Study in Spain        Discussion             Actor               A conversation about the differences between transport and rental prices in different countries







periments with two HMDs: Samsung GearVR and Lenovo Mirage Solo.

•  Presence questionnaire scores, specifically TPI (Lombard et al.) and PQ (Witmer & Singer), obtained from 48 participants.
•  Statistical analysis notebook.

The conclusions were used in the selection of the test material with a longer duration to increase the immersive experience, the HMD and the evaluation method (touchpad or handheld controller), and the methodology for video quality and presence. Specifically, the fact that the fatigue effect detected during the pilot study may affect the evaluation of socioemotional features is a motivation for the methodology for video quality evaluation proposed in this experiment. Additionally, we present a comparison of the scores obtained during the video quality evaluation of the pilot study and the experiment explained in detail in this paper.

3  WORK APPROACH

Based on the previous analysis, we pose the following Research Questions (RQs):

•  RQ1: Is it possible to evaluate video quality in videos of long duration designed for the evaluation of socioemotional features?
•  RQ2: Which technical aspects, such as the position of the camera, the type of conversation, the video quality, the acquisition perspective, etc., influence socioemotional features?
•  RQ3: Which interactive elements can be provided to the remote client to improve some socioemotional concepts such as presence or attention?

To answer these RQs, we designed a subjective experiment in which an immersive communication between a provider and a remote client was simulated, as presented in Figure 1. At the provider side, a conversation among several people took place, and the remote client attended virtually wearing an HMD. In the subjective test, the observer took the role of the remote client and viewed pre-recorded 360VR videos with fluctuations of quality, simulating a VR streaming communication.

The contents used in the experiment showed simulated conversations around a common topic: international experiences, i.e. working or studying abroad. The main idea behind choosing this specific context was our ability to gather a balanced sample of people who have had international experiences and people who have not. We acquired three contents with different acquisition perspectives (actor and observer), each one in a different genre (everyday conversation, educational, and discussion) about international experiences. For that, student volunteers were recruited for the recordings, both exchange and national students from the university, making the conversations more realistic and fluent. Conversations were in English, making the experiment accessible to different nationalities and mother tongues and increasing the diversity of the sample.

The experiment considered three test conditions, summarized in Table 1, and each participant was assigned a condition. However, in all conditions, participants viewed the same Processed Video Sequences (PVSs). After each video, they were requested to rate its visual quality, as well as to evaluate the socioemotional features of interest: empathy and changes in attitude, spatial and social presence, and attention.

Participants assigned to condition A had the additional task of periodically rating the visual quality of the video during its playback, whenever its quality changed. This is a conventional design to evaluate the subjective quality of the video sequence under different intensities of impairment. However, this focused task might have an impact on the evaluation of socioemotional features compared to the baseline scenario without the task (condition B).

Finally, participants in condition C were provided with an additional interactivity element: the possibility to see their own hands and take notes about the conversation. We hypothesize that this could enhance socioemotional features such as presence and attention with respect to the other conditions.
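
The three test conditions described above can be summarized in a short sketch. It is only an illustration under stated assumptions: the names `Condition`, `CONDITIONS` and `assign_conditions`, the flag names, the seed and the participant count in the usage note are hypothetical and do not come from the paper. The sketch also assumes that condition C does not include the in-playback rating task, which the text does not state explicitly, and the shuffled round-robin assignment is one common way to balance groups, not necessarily the procedure used in the experiment.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    """One test condition; field names are illustrative, not taken from the paper."""
    name: str
    rates_quality_during_playback: bool  # extra in-playback rating task (condition A)
    sees_hands_and_takes_notes: bool     # extra interactivity element (condition C)


# Assumption: condition C has no in-playback rating task (not stated in the text).
CONDITIONS = (
    Condition("A", rates_quality_during_playback=True, sees_hands_and_takes_notes=False),
    Condition("B", rates_quality_during_playback=False, sees_hands_and_takes_notes=False),
    Condition("C", rates_quality_during_playback=False, sees_hands_and_takes_notes=True),
)


def assign_conditions(participant_ids, seed=0):
    """Shuffle participants, then deal them round-robin across the conditions.

    Group sizes differ by at most one. This is a hypothetical balancing
    scheme, not the assignment procedure reported by the authors.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(shuffled)}
```

For example, `assign_conditions(range(30))` would place ten hypothetical participants in each condition. In every condition the participants still watch the same PVSs and give the same post-video ratings; only the two flags differ.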