
Papers

Monday, 18 March 2024 (Timezone: Orlando, Florida USA UTC-4)
MO1G Tracking and Motion Capture 14:00‑15:00 Fantasia Ballroom G
MO1H Social Applications 14:00‑15:00 Fantasia Ballroom H
MO1J Embodiment, Avatars and Presence 14:00‑15:00 Fantasia Ballroom J
MO2G Psycho- and Sociocultural Dimensions of Virtual Identities 15:30‑17:00 Fantasia Ballroom G
MO2H Multimodal Input and Interaction 15:30‑17:00 Fantasia Ballroom H
MO2J Perception and Cognition 15:30‑17:00 Fantasia Ballroom J
Tuesday, 19 March 2024 (Timezone: Orlando, Florida USA UTC-4)
TU1G 3D Interaction and Touch 8:30‑9:45 Fantasia Ballroom G
TU1H Multisensory Interfaces 8:30‑9:45 Fantasia Ballroom H
TU1J Evaluating Immersion: UX and Interaction 8:30‑9:45 Fantasia Ballroom J
TU2G Locomotion and Redirection 13:30‑15:00 Fantasia Ballroom G
TU2H Projections 13:30‑15:00 Fantasia Ballroom H
TU2J 3D Interaction and Teleoperation 13:30‑15:00 Fantasia Ballroom J
TU3G 3D Interaction and Modality 15:30‑17:00 Fantasia Ballroom G
TU3H Perception in Navigation, Locomotion and Redirection 15:30‑17:00 Fantasia Ballroom H
TU3J User Experience 15:30‑17:00 Fantasia Ballroom J
Wednesday, 20 March 2024 (Timezone: Orlando, Florida USA UTC-4)
WE1G 360 Video 8:30‑9:45 Fantasia Ballroom G
WE1H Immersive Analytics and Visualization 8:30‑9:45 Fantasia Ballroom H
WE1J Industrial and Sports Applications 8:30‑9:45 Fantasia Ballroom J
WE2G 3D Authoring 10:15‑11:15 Fantasia Ballroom G
WE2H Gaze 10:15‑11:15 Fantasia Ballroom H
WE2J Collaboration 10:15‑11:15 Fantasia Ballroom J
WE3G Haptics 11:30‑12:30 Fantasia Ballroom G
WE3H Healthcare Applications 11:30‑12:30 Fantasia Ballroom H
WE3J Human Factors and Ergonomics 11:30‑12:30 Fantasia Ballroom J
WE4G Perception in AR, MR and Near-Eye Displays 13:30‑15:00 Fantasia Ballroom G
WE4H Rendering and Displays 13:30‑15:00 Fantasia Ballroom H
WE4J Experiences, Cybersickness and Presence 13:30‑15:00 Fantasia Ballroom J
Thursday, 21 March 2024 (Timezone: Orlando, Florida USA UTC-4)
TH1G Distributed Systems and Telepresence 8:30‑9:45 Fantasia Ballroom G
TH1H Multimodal Perception and Experiences 8:30‑9:45 Fantasia Ballroom H
TH1J Depth and Distance Perception 8:30‑9:45 Fantasia Ballroom J
TH2G Touch, Tangible, and Gesture Interfaces 10:15‑11:15 Fantasia Ballroom G
TH2H Graphics and Crowds 10:15‑11:15 Fantasia Ballroom H
TH2J Ethics in VR 10:15‑11:15 Fantasia Ballroom J
TH3G Modeling and Simulation 11:30‑12:30 Fantasia Ballroom G
TH3H Software 11:30‑12:30 Fantasia Ballroom H
TH3J Localization and Tracking 11:30‑12:30 Fantasia Ballroom J
TH4G Education Applications 13:30‑15:00 Fantasia Ballroom G
TH4H Virtual Interaction and Embodiment 13:30‑15:00 Fantasia Ballroom H
TH4J Locomotion and Navigation 13:30‑15:00 Fantasia Ballroom J

Session: Tracking and Motion Capture (MO1G)

Date & Time: Monday, 18 March 2024, 14:00-15:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom G
Session Chair: Manuela Chessa

Best Paper Award

Swift-Eye: Towards Anti-blink Pupil Tracking for Precise and Robust High-Frequency Near-Eye Movement Analysis with Event Cameras (Journal: P1220)

Tongyu Zhang, Shandong University; Yiran Shen, Shandong University; Guangrong Zhao, School of Software; Lin Wang, HKUST, GZ; Xiaoming Chen, Beijing Technology and Business University; Lu Bai, Shandong University; Yuanfeng Zhou, Shandong University

In this paper, we propose Swift-Eye, an offline, precise, and robust pupil estimation and tracking framework to support high-frequency near-eye movement analysis, especially when the pupil region is partially occluded. Swift-Eye is built upon emerging event cameras to capture the high-speed movement of the eyes at high temporal resolution. A series of bespoke components then generate high-quality near-eye movement video at frame rates above one kilohertz and handle occlusion of the pupil caused by involuntary eye blinks. According to our extensive evaluations on EV-Eye, a large-scale public dataset for eye tracking using event cameras, Swift-Eye shows high robustness against significant occlusion.

GBOT: Graph-Based 3D Object Tracking for Augmented Reality-Assisted Assembly Guidance (Conference: P1136)

Shiyu Li, Technical University of Munich; Hannah Schieber, Friedrich-Alexander University; Niklas Corell, Friedrich-Alexander University Erlangen-Nürnberg; Bernhard Egger, Friedrich-Alexander-Universität Erlangen-Nürnberg; Julian Kreimeier, Technical University of Munich; Daniel Roth, Technical University of Munich

Guidance for assemblable parts is a promising application of augmented reality. Augmented reality assembly guidance requires the 6D poses of target objects in real time. To address this problem, we present Graph-based Object Tracking (GBOT), a novel graph-based single-view RGB-D tracking approach. The real-time markerless multi-object tracking is initialized via 6D pose estimation and updates the graph-based assembly poses. By utilizing the relative poses of the individual assembly parts, we update the multi-state assembly graph. Quantitative experiments on synthetic data and a further qualitative study on real test data show that GBOT can outperform existing work, moving towards context-aware augmented reality assembly guidance.
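The graph update described above can be pictured with a small sketch. The following Python snippet is purely illustrative and not the authors' implementation: it assumes each assembly edge stores the relative pose measured when two parts are joined, so that the poses of attached parts can later be propagated from a tracked anchor part.

```python
# Illustrative sketch (not GBOT's code): propagating 6D poses through a simple
# assembly graph, assuming each edge stores the relative pose at assembly time.
import numpy as np

class AssemblyGraph:
    def __init__(self):
        self.poses = {}   # part name -> 4x4 world pose (from per-part tracking)
        self.edges = {}   # (parent, child) -> 4x4 relative pose recorded at assembly

    def set_tracked_pose(self, part, pose):
        self.poses[part] = pose

    def join(self, parent, child):
        # Record the child's pose relative to the parent at the moment of assembly.
        self.edges[(parent, child)] = np.linalg.inv(self.poses[parent]) @ self.poses[child]

    def propagate(self, anchor):
        # Once assembled, derive child poses from the tracked anchor part.
        for (parent, child), rel in self.edges.items():
            if parent == anchor:
                self.poses[child] = self.poses[anchor] @ rel

graph = AssemblyGraph()
graph.set_tracked_pose("base", np.eye(4))
leg = np.eye(4); leg[:3, 3] = [0.1, 0.0, 0.05]
graph.set_tracked_pose("leg", leg)
graph.join("base", "leg")    # assembly step detected
graph.propagate("base")      # later frames: update "leg" from the tracked "base"
```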

Best Presentation Honorable Mention

BOXRR-23: 4.7 Million Motion Capture Recordings from 105,000 XR Users (Journal: P1654)

Vivek C Nair, UC Berkeley; Wenbo Guo, Purdue University; Rui Wang, Carnegie Mellon University; James F. O'Brien, UC Berkeley; Louis Rosenberg, Unanimous AI; Dawn Song, UC Berkeley

Extended reality (XR) devices such as the Meta Quest and Apple Vision Pro have seen a recent surge in attention, with motion tracking "telemetry" data lying at the core of nearly all XR and metaverse experiences. Researchers are just beginning to understand the implications of this data for security, privacy, usability, and more, but currently lack large-scale human motion datasets to study. The BOXRR-23 dataset contains 4,717,215 motion capture recordings, voluntarily submitted by 105,852 XR device users from over 50 countries. BOXRR-23 is over 200 times larger than the largest existing motion capture research dataset and uses a new, highly efficient and purpose-built XR Open Recording (XROR) file format.

Pose-Aware Attention Network for Flexible Motion Retargeting by Body Part (Invited Journal: P3005)

Lei Hu, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China & University of Chinese Academy of Sciences, Beijing, China; Zihao Zhang, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Chongyang Zhong, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China & University of Chinese Academy of Sciences, Beijing, China; Boyuan Jiang, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China & University of Chinese Academy of Sciences, Beijing, China; Shihong Xia, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China & University of Chinese Academy of Sciences, Beijing, China

Motion retargeting is a fundamental problem in computer graphics and computer vision. Existing approaches usually have many strict requirements, such as the source and target skeletons needing to have the same number of joints or share the same topology. To tackle this problem, we note that skeletons with different structures may share common body parts despite differences in joint numbers. Following this observation, we propose a novel, flexible motion retargeting framework. The key idea of our method is to regard the body part as the basic retargeting unit rather than directly retargeting the whole-body motion. To enhance the spatial modeling capability of the motion encoder, we introduce a pose-aware attention network (PAN) in the motion encoding phase. The PAN is pose-aware since it can dynamically predict the joint weights within each body part based on the input pose, and then construct a shared latent space for each body part by feature pooling. Extensive experiments show that our approach generates better motion retargeting results, both qualitatively and quantitatively, than state-of-the-art methods. Moreover, we show that our framework can generate reasonable results even for more challenging retargeting scenarios, such as retargeting between bipedal and quadrupedal skeletons, thanks to the body-part retargeting strategy and the PAN. Our code is publicly available.
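To make the per-body-part pooling idea concrete, here is a minimal PyTorch sketch; the joint grouping, layer sizes, and attention scoring are illustrative assumptions, not the published PAN architecture.

```python
# Minimal sketch of pose-aware, per-body-part feature pooling over joint features
# of shape (batch, joints, dim); grouping and scoring are assumed for illustration.
import torch
import torch.nn as nn

class BodyPartPooling(nn.Module):
    def __init__(self, dim, body_parts):
        super().__init__()
        self.body_parts = body_parts          # e.g. {"left_arm": [0, 1, 2], ...}
        self.score = nn.Linear(dim, 1)        # predicts a weight per joint from its features

    def forward(self, joint_feats):           # joint_feats: (batch, joints, dim)
        pooled = {}
        for name, joints in self.body_parts.items():
            feats = joint_feats[:, joints, :]              # joints of this body part
            weights = torch.softmax(self.score(feats), dim=1)
            pooled[name] = (weights * feats).sum(dim=1)    # (batch, dim) per body part
        return pooled

parts = {"left_arm": [0, 1, 2], "right_arm": [3, 4, 5]}
pool = BodyPartPooling(dim=64, body_parts=parts)
latent = pool(torch.randn(2, 6, 64))          # shared latent per body part
```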

Session: Social Applications (MO1H)

Date & Time: Monday, 18 March 2024, 14:00-15:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom H
Session Chair: Mary Whitton

Designing and Evaluating a VR Lobby for a socially enriching remote Opera watching experience (Journal: P1019)

Sueyoon Lee, Centrum Wiskunde & Informatica (CWI); Irene Viola, Centrum Wiskunde & Informatica (CWI); Silvia Rossi, Centrum Wiskunde & Informatica (CWI); Zhirui Guo, Centrum Wiskunde & Informatica (CWI); Ignacio Reimat, Centrum Wiskunde & Informatica (CWI); Kinga Ławicka, Centrum Wiskunde & Informatica (CWI); Alina Striner, Centrum Wiskunde & Informatica (CWI); Pablo Cesar, Centrum Wiskunde & Informatica (CWI)

In this paper, we design, implement, and evaluate a VR theatre lobby as a dynamic space for remote users to communicate and interact following their virtual opera experiences. We conducted an initial test with paired experts (N=10) in a highly realistic representation using our VR lobby prototype, developed based on the theoretical design concept. After refining the prototype for better usability and user experience, we ran a between-subject controlled study (N=40) to compare individuals' and pairs' user experience. The results of our mixed-methods analysis reveal the strength of our social VR lobby in connecting with other users, consuming the opera more deeply, and exploring new possibilities beyond what is common in real life.

Springboard, Roadblock, or "Crutch"?: How Transgender Users Leverage Voice Changers for Gender Presentation in Social Virtual Reality (Conference: P1377)

Kassie C Povinelli, University of Wisconsin-Madison; Yuhang Zhao, University of Wisconsin-Madison

Social virtual reality (VR) serves as a vital platform for transgender and gender-nonconforming (TGNC) individuals to explore their identities through avatars and build online communities. However, it presents a challenge: the disconnect between avatar embodiment and voice representation, often leading to misgendering and harassment. We interviewed 13 TGNC users, finding that using a voice changer not only reduces voice-related harassment but also allows them to experience gender euphoria through their modified voice, motivating them to pursue voice training and medication to achieve desired voices. Furthermore, we identified technical barriers and possible improvements to voice changer technology for TGNC individuals.

Enhancing Positive Emotions through Interactive Virtual Reality Experiences: An EEG-Based Investigation (Conference: P1475)

Shiwei Cheng, Zhejiang University of Technology; Sheng Danyi, Computer Science; Yuefan Gao, Cyborg Intelligence Technology Company; Zhanxun DONG, Shanghai Jiao Tong University; Ting Han, Shanghai Jiao Tong University

Virtual reality (VR) holds potential to promote feelings of well-being by evoking positive emotions. Our study aimed to investigate the types of interaction behaviors in VR that effectively enhance positive emotions. An exploratory study (N = 22) was conducted on a virtual museum to study the impact of varying user autonomy and interaction behaviors on emotions. An individual emotion model based on electroencephalogram (EEG) was employed to predict the promotion of positive emotions. The results indicated that incorporating creative interaction functions increased positive emotions, with the extent of increase closely linked to the degree of user autonomy.

Self-Guided DMT: Exploring a Novel Paradigm of Dance Movement Therapy in Mixed Reality for Children with ASD (Journal: P1931)

Weiying Liu, School of Mechanical, Electrical & Information Engineering; Yanyan Zhang, Weihai Maternal & Child Health Care Hospital; Baiqiao Zhang, Shandong University; Qian qian Xiong, Shandong University; Hong Zhao, Shandong University; Sheng Li, Peking University; Juan Liu, Shandong University; Yulong Bian, School of Mechanical, Electrical & Information Engineering

Children with Autism Spectrum Disorder (ASD) often face motor challenges. Traditional Dance Movement Therapy (DMT) lacks effectiveness. We propose Mixed Reality DMT with interactive virtual agents, offering immersive content and feedback. Our novel self-guided training paradigm creates virtual twin agents that resemble children with ASD using a single photo, aiding them during training. In an experiment involving 24 children with ASD, self-guidance through the twin agent significantly improved training performance, particularly in movement quality and target-related responses. This approach has clinical potential in medical treatment and rehabilitation for children with ASD.

Session: Embodiment, Avatars and Presence (MO1J)

Date & Time: Monday, 18 March 2024, 14:00-15:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom J
Session Chair: Rebecca Fribourg

Measuring Embodiment: Movement Complexity and the Impact of Personal Characteristics (Invited Journal: P3004)

Tabitha C. Peck, Davidson College, Davidson, NC, USA; Jessica J. Good, Davidson College, Davidson, NC, USA

A user's personal experiences and characteristics may impact the strength of an embodiment illusion and affect resulting behavioral changes in unknown ways. This paper presents a novel re-analysis of two fully-immersive embodiment user studies (n=189 and n=99) using structural equation modeling to test the effects of personal characteristics on subjective embodiment. Results demonstrate that individual characteristics (gender and participation in science, technology, engineering, or math in Experiment 1; age and video gaming experience in Experiment 2) predicted differing self-reported experiences of embodiment. Results also indicate that increased self-reported embodiment predicts environmental response, in this case faster and more accurate responses within the virtual environment. Importantly, head-tracking data is shown to be an effective objective measure for predicting embodiment, without requiring researchers to utilize additional equipment.

Embodying a self-avatar with a larger leg: its impacts on motor control and dynamic stability (Journal: P1418)

Valentin Vallageas, Imaging and orthopedics research laboratory; Rachid Aissaoui, CHUM research center; Iris Willaert, Ecole de technologie superieure; David Labbe PhD, Ecole de technologie superieure

Several studies demonstrate that virtual reality users can embody avatars with altered morphologies, adapting their mental body map (body schema) crucial for planning movements. This study explores how embodying avatars with enlarged legs affects motor planning. Thirty participants embodied avatars with different leg sizes, combined with two different embodiment levels. Gait initiation tasks showed no significant biomechanical changes. Asynchronous stimuli reduced embodiment without affecting stability measures. Deforming avatars might subtly influence motor execution in rehabilitation. The study suggests the adaptability of the body schema to morphological modifications, with implications for individuals with impaired mobility.

Human Factors at Play: Understanding the Impact of Conditioning on Presence and Reaction Time in Mixed Reality (Journal: P1076)

Yasra Chandio, University of Massachusetts, Amherst; Victoria Interrante, University of Minnesota; Fatima Muhammad Anwar, UMASS Amherst

This study investigates the link between presence in mixed reality and reaction time, focusing on psychological and physiological human aspects. Presence, usually gauged using subjective surveys, is now found to align with objective metrics (reaction time). Our research delves into how human conditioning impacts this relationship. An exploratory study involving 60 users under varied conditioning scenarios (control, positive, negative) discovered a notable correlation (-0.64) between presence scores and reaction times, suggesting that the impact of human factors on reaction time correlates with its effect on presence. Our study takes another critical step toward using objective and systemic measures like reaction time as a presence measure.

Age and Realism of Avatars in Simulated Augmented Reality: Experimental Evaluation of Anticipated User Experience (Conference: P1922)

Veronika Mikhailova, Technische Universität Ilmenau; Christoph Gerhardt, Technische Universität Ilmenau; Christian Kunert, Technische Universität Ilmenau; Tobias Schwandt, Technische Universität Ilmenau; Florian Weidner, Technische Universität Ilmenau; Wolfgang Broll, Technische Universität Ilmenau; Nicola Döring, Technische Universität Ilmenau

The study investigates the social attractiveness of avatars in simulated augmented reality (AR) based on a sample of N=2086 age-diverse participants from Germany. In an online setting, participants evaluated avatars representing different age groups (younger, middle-aged, older) and levels of realism (low, medium, high). Results demonstrated a strong preference for younger, high-realism avatars as communication partners and for self-representation in AR. However, older adults showed a tendency to opt for avatars resembling their actual age. The study provides insights into social interactions in AR, highlighting age-related stereotypes in avatar-based communication and underscoring the need for a more inclusive avatar design.

Session: Psycho- and Sociocultural Dimensions of Virtual Identities (MO2G)

Date & Time: Monday, 18 March 2024, 15:30-17:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom G
Session Chair: Eric Hodgson

On the Emergence of Symmetrical Reality (Conference: P1098)

Zhenliang Zhang, Beijing Institute for General Artificial Intelligence; Zeyu Zhang, Beijing Institute for General Artificial Intelligence; Ziyuan Jiao, Beijing Institute for General Artificial Intelligence; Yao Su, Beijing Institute for General Artificial Intelligence; Hangxin Liu, Beijing Institute for General Artificial Intelligence; Wei Wang, Beijing Institute for General Artificial Intelligence; Song-Chun Zhu, Beijing Institute for General Artificial Intelligence

In this paper, we introduce the symmetrical reality framework, which offers a unified representation encompassing various forms of physical-virtual amalgamations. This framework enables researchers to better comprehend how AI agents can collaborate with humans and how distinct technical pathways of physical-virtual integration can be consolidated from a broader perspective. We then delve into the coexistence of humans and AI, demonstrating a prototype system that exemplifies the operation of symmetrical reality systems for specific tasks, such as pouring water. Finally, we propose an instance of an AI-driven active assistance service that illustrates the potential applications of symmetrical reality.

Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters (Conference: P1518)

Zechen Bai, National University of Singapore; Peng Chen, Institute of Software, Chinese Academy of Sciences; Xiaolan Peng, Institute of Software, Chinese Academy of Sciences; Lu Liu, Institute of Software, Chinese Academy of Sciences; Naiming Yao, Institute of Software, Chinese Academy of Sciences; Hui Chen, Institute of Software, Chinese Academy of Sciences

This paper introduces a comprehensive approach to automatically generate facial animations for customized characters, irrespective of their blendshape topologies and texture appearances. The method involves estimating blendshape coefficients from input images or videos using a deep learning model. The proposed toolkit incorporates this model, featuring user-friendly interfaces and a human-in-the-loop scheme. Evaluation results demonstrate the flexibility to support personalized virtual character models. The toolkit facilitates easy and efficient facial animation generation, yielding satisfactory quality. Human-in-the-loop involvement enhances solution performance.

The Influence of Environmental Context on the Creation of Cartoon-like Avatars in Virtual Reality (Conference: P1737)

Pauline Bimberg, University of Trier; Michael Feldmann, Trier University; Benjamin Weyers, Trier University; Daniel Zielasko, University of Trier

The user study presented in this paper explores the effects that being immersed in different virtual scenes has on a user's avatar-design behavior. For this purpose, we have developed a character creation tool that lets users configure their appearance in Virtual Reality. This tool has then been employed in a user study involving 33 participants, who were asked to configure a virtual avatar in a beach and a hospital environment. Our results show that the environment that participants were immersed in influenced their design behavior, with the beach environment leading to a more extensive use of accessories than the hospital scene.

The Impact of Avatar Stylization on Trust (Conference: P1858)

Ryan Canales, Reality Labs Research, Meta; Doug Roble, Reality Labs Research, Meta; Michael Neff, Reality Labs Research, Meta

Virtual Reality (VR) allows people to choose any avatar to represent themselves. How does this choice impact social interaction that often relies on the establishment of trust? Are people more likely to trust a highly realistic avatar or is there flexibility in representation? This work presents a study exploring this question using a high-stakes medical scenario. Participants meet three different doctors with three different style levels: realistic, caricatured, and an in-between "Mid" level. Trust ratings are largely consistent across style levels, but participants were more likely to select doctors with the "Mid" level of stylization for a second opinion. There is a clear preference against one of the three doctor identities.

Influence of Virtual Shoe Formality on Gait and Cognitive Performance in a VR Walking Task (Conference: P2102)

Sebastian Oberdörfer, University of Würzburg; Sandra Birnstiel, Friedrich-Alexander-Universität Erlangen-Nürnberg; Marc Erich Latoschik, University of Würzburg

Shoes come in various degrees of formality, and their structure can affect human gait. In our study, we embodied 39 participants in a generic avatar of the user's gender wearing three different pairs of shoes as a within-subjects condition. The shoes differed in their degree of formality. We measured gait during a 2-minute walking task, during which participants wore the same real shoes, and assessed selective attention using the Stroop task. Our results show significant differences in gait between the tested virtual shoe pairs. We found small effects between the three shoe conditions with respect to selective attention. However, we found no significant differences with respect to correct items and response time in the Stroop task.

Best Paper Honorable Mention

Stepping into the Right Shoes: The Effects of User-Matched Avatar Ethnicity and Gender on Sense of Embodiment in Virtual Reality (Journal: P1126)

Tiffany D. Do, University of Central Florida; Camille Isabella Protko, University of Central Florida; Ryan P. McMahan, University of Central Florida

In many consumer VR applications, users embody predefined characters that offer minimal customization options, frequently emphasizing storytelling over choice. We investigated whether matching a user's ethnicity and gender with their virtual self-avatar affects their sense of embodiment in VR. A 2x2 experiment with diverse participants (n=32) showed that matching ethnicity increased the overall sense of embodiment, irrespective of gender, impacting the sense of appearance, response, and ownership. Our findings highlight the significance of avatar-user alignment for a more immersive VR experience.

Session: Multimodal Input and Interaction (MO2H)

Date & Time: Monday, 18 March 2024, 15:30-17:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom H
Session Chair: Mayra Donaji Barrera Machuca

Impact of multimodal instructions for tool manipulation skills on performance and user experience in an Immersive Environment (Conference: P1762)

Cassandre Simon, Univ Evry, Université Paris Saclay; Manel Boukli Hacene, Univ Evry, Université Paris Saclay; Flavien Lebrun, Univ Evry, Université Paris Saclay; Samir Otmane, Univ Evry, Université Paris Saclay; Amine Chellali, Univ Evry, Université Paris Saclay

This study explores the use of multimodal communication to convey instructions to learners on the amplitude of movements to perform in an immersive environment. The study aims to examine the impact of four modality combinations on performance, workload, and user experience. The results show that participants achieved higher accuracy with the visual-haptic and verbal-visual-haptic conditions. Moreover, they performed the movements faster, and their movement trajectories were closer to the reference trajectories in the visual-haptic condition. Finally, the most preferred verbal-visual-haptic combination enhanced the sense of presence, co-presence, social presence, and learning experience. No impact on workload was observed.

Best Paper Honorable Mention

Robust Dual-Modal Speech Keyword Spotting for XR Headsets (Journal: P1317)

Zhuojiang Cai, Beihang University; Yuhan Ma, Beihang University; Feng Lu, Beihang University

While speech interaction finds widespread utility within the Extended Reality (XR) domain, conventional vocal speech keyword spotting systems continue to grapple with formidable challenges, including suboptimal performance in noisy environments, impracticality in situations requiring silence, and susceptibility to inadvertent activations when others speak nearby. These challenges, however, can potentially be surmounted through the cost-effective fusion of voice and lip movement information. Consequently, we propose a novel vocal-echoic dual-modal keyword spotting system for XR headsets. Experimental results demonstrate the promising performance of this dual-modal system across various challenging scenarios.

Modeling the Intent to Interact with VR using Physiological Features (Invited Journal: P3009)

Willy Nguyen, Universite Paris-Saclay, France; Klaus Gramann, TU Berlin, Germany; Lukas Gehrke, TU Berlin, Germany

Mixed-Reality (XR) technologies promise a user experience (UX) that rivals the interactive experience with the real world. The key facilitators in the design of such a natural UX are that the interaction has zero lag and that users experience no excess mental load. This is difficult to achieve due to technical constraints such as motion-to-photon latency as well as false positives during gesture-based interaction.

CAEVR: Biosignals-Driven Context-Aware Empathy in Virtual Reality (Journal: P1845)

Kunal Gupta, The University of Auckland; Yuewei Zhang, The University of Auckland; Tamil Selvan Gunasekaran, The University of Auckland; Nanditha Krishna, Amrita Vishwa Vidyapeetham; Yun Suen Pai, Keio University Graduate School of Media Design; Mark Billinghurst, The University of Auckland

This study examines the impact of Context-Aware Empathic VR (CAEVR) on users' emotions and cognition in VR. We developed personalized and generalized emotion recognition models using real-time electroencephalography, electrodermal activity, and heart rate variability data. These models were applied in a Context-Aware Empathic virtual agent and an Emotion-Adaptive VR environment. Results show increased positive emotions, cognitive load, and empathy towards the CAE agent. This suggests CAEVR's potential for enhancing user-agent interactions. The paper concludes with lessons and future research directions.

VRMN-bD: A Multi-modal Natural Behavior Dataset of Immersive Human Fear Responses in VR Stand-up Interactive Games (Conference: P1128)

He Zhang, Tsinghua University; Xinyang Li, Tsinghua University; Yuanxi Sun, Communication University of China; Xinyi Fu, Tsinghua University; Christine Qiu, KTH Royal Institute of Technology; John M. Carroll, Pennsylvania State University

This study focuses on analyzing fear in virtual reality (VR) using horror games. We collected multi-modal data (posture, audio, physiological signals) from 23 participants to understand fear responses. Our LSTM-based model achieved prediction accuracies of 65.31% (6-level fear classification) and 90.47% (2-level fear classification). We developed VRMN-bD, a unique multi-modal dataset focusing on natural human fear responses in VR interactive environments, surpassing existing datasets in data scale, collection method, and audience scope. This research contributes to advancements in immersive game development, scene creation, and virtual human-computer interactions by providing insights into fear emotions in VR environments.
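As a rough illustration of the kind of sequence model the abstract mentions, the sketch below shows an LSTM classifier over fused per-frame features; the feature dimension, sequence length, and layer sizes are invented, and only the 6-level label count follows the abstract.

```python
# Hypothetical sketch of an LSTM fear classifier over fused multi-modal features
# (posture, audio, physiological signals); sizes are placeholders, not the paper's.
import torch
import torch.nn as nn

class FearLSTM(nn.Module):
    def __init__(self, feat_dim, hidden=128, num_levels=6):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_levels)

    def forward(self, x):               # x: (batch, time, feat_dim) per-frame features
        _, (h, _) = self.lstm(x)
        return self.classifier(h[-1])   # logits over fear levels

model = FearLSTM(feat_dim=64)
logits = model(torch.randn(4, 300, 64))   # 4 clips, 300 time steps (made-up shapes)
pred = logits.argmax(dim=-1)
```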

Session: Perception and Cognition (MO2J)

Date & Time: Monday, 18 March 2024, 15:30-17:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom J
Session Chair: Bobby Bodenheimer

ACHOO - Bless you! Sense of Presence can provoke Proactive Mucosal Immune Responses in Immersive Human-Agent Interactions (Conference: P1247)

Judith Katharina Keller, Universität Hamburg; Agon Kusari, Universität Hamburg; Sophie Czok, Universität Hamburg; Birgit Simgen, MVZ Volkmann Laboratory; Frank Steinicke, Universität Hamburg; Esther Diekhof, Universität Hamburg

Previous work suggests that the mere visual perception of disease cues can proactively enhance mucosal immune responses even without actual pathogen exposure. We present the first immersive immunological experiment, which investigates whether social interactions with virtual agents in VR can lead to a mucosal immune response, in particular a proactive release of secretory immunoglobulin A (sIgA) in saliva. To this end, we simulated a virtual bus stop scenario in which participants were required to closely approach and establish eye contact with ten agents. We found that sIgA secretion increased when agents sneezed as well as when they did not sneeze. In the latter case, the increase was correlated with the perceived involvement and sense of presence.

Best Paper Award

Evaluating Text Reading Speed in VR Scenes and 3D Particle Visualizations (Journal: P1072)

Johannes Novotny PhD, VRVis Zentrum für Virtual Reality und Visualisierung; David H. Laidlaw, Brown University

We report on the effects of text size and display parameters on reading speed and legibility in three state-of-the-art VR displays. Two are head-mounted displays, and one is Brown’s CAVE-like YURT. Our two perception experiments uncover limits where reading speed declines as the text size approaches the so-called critical print sizes (CPS) of individual displays. We observe an inverse correlation between display resolution and CPS, revealing hardware-specific limitations on legibility beyond display resolution, making CPS an effective benchmark for VR devices. Additionally, we report on the effects of text panel placement, orientation, and occlusion-reducing rendering methods on reading speeds in volumetric particle visualization.

Best Presentation Award

The Effects of Auditory, Visual, and Cognitive Distractions on Cybersickness in Virtual Reality (Invited Journal: P3014)

Rohith Venkatakrishnan, School of Computing, Clemson University, USA; Roshan Venkatakrishnan, School of Computing, Clemson University, USA; Balagopal Raveendranath, Department of Psychology, Clemson University, USA; Dawn M. Sarno, Department of Psychology, Clemson University, USA; Andrew C. Robb, School of Computing, Clemson University, USA; Wen-Chieh Lin, Department of Computer Science, National Yang Ming Chiao Tung University, Taiwan; Sabarish V. Babu, School of Computing, Clemson University, USA

Cybersickness (CS) is one of the challenges that has hindered the widespread adoption of Virtual Reality (VR). Consequently, researchers continue to explore novel means to mitigate the undesirable effects associated with this affliction, one that may require a combination of remedies as opposed to a solitary stratagem. Inspired by research probing into the use of distractions as a means to control pain, we investigated the efficacy of this countermeasure against CS, studying how the introduction of temporally time-gated distractions affects this malady during a virtual experience featuring active exploration. Downstream of this, we discuss how other aspects of the VR experience are affected by this intervention. We discuss the results of a between-subjects study manipulating the presence, sensory modality, and nature of periodic and short-lived (5-12 seconds) distractor stimuli across 4 experimental conditions: (1) no-distractors (ND); (2) auditory distractors (AD); (3) visual distractors (VD); (4) cognitive distractors (CD). Two of these conditions (VD and AD) formed a yoked control design wherein every matched pair of ‘seers’ and ‘hearers’ was periodically exposed to distractors that were identical in terms of content, temporality, duration, and sequence. In the CD condition, each participant had to periodically perform a 2-back working memory task, the duration and temporality of which was matched to distractors presented in each matched pair of the yoked conditions. These three conditions were compared to a baseline control group featuring no distractions. Results indicated that the reported sickness levels were lower in all three distraction groups in comparison to the control group. The intervention was also able to both increase the amount of time users were able to endure the VR simulation, as well as avoid causing detriments to spatial memory and virtual travel efficiency. Overall, it appears that it may be possible to make users less consciously aware and bothered by the symptoms of CS, thereby reducing its perceived severity.

Investigating Personalization Techniques for Improved Cybersickness Prediction in Virtual Reality Environments (Journal: P1370)

Umama Tasnim, University of Texas at San Antonio; Rifatul Islam, Kennesaw State University; Kevin Desai, The University of Texas at San Antonio; John Quarles, University of Texas at San Antonio

Recent cybersickness research uses physiological data (HR, EDA) for prediction. However, the role of individual factors like age and gender in these models is unclear. Our study aims to fill this gap, advocating for personalized cybersickness prediction models for an inclusive virtual reality experience. We tested four personalization techniques: data grouping, transfer learning, early shaping, and sample weighing, using an open-source dataset. Results showed a marked improvement in prediction accuracy; for example, DeepTCN's early shaping reduced RMSE by 69.7% compared to the generic model. This underscores the potential of personalized models in enhancing cybersickness prediction, paving the way for future tailored reduction techniques.
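One of the listed techniques, transfer learning, can be sketched as fine-tuning a generic model on a new user's first few samples. The snippet below is an assumption-laden toy example (network, feature count, and training details are invented), not the study's pipeline.

```python
# Toy sketch of per-user personalization via transfer learning: copy a generic
# sickness regressor and briefly fine-tune it on the new user's own early data.
import copy
import torch
import torch.nn as nn

generic = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
# Assume `generic` was already trained on physiological features (HR, EDA, ...)
# pooled across many users to predict a continuous sickness score.

personal = copy.deepcopy(generic)        # per-user copy to fine-tune
opt = torch.optim.Adam(personal.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

user_x = torch.randn(20, 8)              # a new user's first 20 feature windows (invented)
user_y = torch.randn(20, 1)              # their early self-reported sickness scores (invented)

for _ in range(50):                      # brief fine-tuning pass on the user's data
    opt.zero_grad()
    loss = loss_fn(personal(user_x), user_y)
    loss.backward()
    opt.step()
```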

Age and Gender Differences in the Pseudo-Haptic Effect on Computer Mouse Operation in a Desktop Environment (Invited Journal: P3011)

Yuki Ban, Graduate School of Frontier Sciences, University of Tokyo, Chiba, Japan; Yusuke Ujitoko, NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation, Atsugi, Japan

Pseudo-haptics is a method that can provide a haptic sensation without requiring a physical haptic device. The effect of pseudo-haptics is known to depend on the individual, but it is unclear which factors cause individual differences. As the first study establishing a calibration method for these differences in future research, we examined the differences in the pseudo-haptic effect on mouse cursor operation in a desktop environment depending on the age and gender of the user. We conducted an online experiment and collected data from more than 400 participants. The participants performed a task of lifting a virtual object with a mouse pointer. We found that the effect of pseudo-haptics was greater in younger or male participants than in older or female participants. We also found that the effect of pseudo-haptics, which varied with age and gender, can be explained by habituation to the mouse in daily life and the accuracy of detecting the pointer position using vision or proprioception. Specifically, the pseudo-haptic effect was higher for those who used the mouse more frequently and had higher accuracy in identifying the pointer position using proprioception or vision. The results of the present study not only indicate the factors that cause age and gender differences but also provide hints for calibrating these differences.

Multimodal Physiological Analysis of Impact of Emotion on Cognitive Control in VR (Journal: P1899)

Ming Li, Beihang University; Junjun Pan, Beihang University; Yu Li, Beijing Normal University; Yang Gao, Beihang University; Hong Qin, Stony Brook University; Yang Shen, Collaborative Innovation Center of Assessment for Basic Education Quality, Beijing Normal University

Cognitive control is perplexing to elucidate and can be influenced by emotions. Understanding individual cognitive control in VR is crucial for adaptive applications. In this study, we investigate the influence of emotions on cognitive control based on the arousal-valence model. We recruited 26 participants, induced emotions through VR videos, and then had them perform related cognitive control tasks. Leveraging EEG, HRV, and EDA, we employ deep learning to categorize cognitive control levels. The experimental results demonstrate that high-arousal emotions significantly enhance users' cognitive control abilities, and our model achieves an accuracy of 84.52% in distinguishing between high and low cognitive control.

Session: 3D Interaction and Touch (TU1G)

Date & Time: Tuesday, 19 March 2024, 8:30-9:45 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom G
Session Chair: Bruce Thomas

Exploring Bimanual Haptic Feedback for Spatial Search in Virtual Reality (Journal: P1812)

BoYu Gao, Jinan University; Tong Shao, Jinan University; Huawei Tu, La Trobe University; Qizi Ma, Jinan University; Zitao Liu, Jinan University; Teng Han, Institute of Software, Chinese Academy of Sciences

Spatial search tasks are common and crucial in many Virtual Reality (VR) applications. In this work, we explored bimanual haptic feedback with various combinations of haptic properties, where four types of bimanual haptic feedback were designed, for spatial search tasks in VR. Two experiments were designed to evaluate the effectiveness of bimanual haptic feedback on spatial direction guidance and search in VR. The results of the experiments showed that the proposed bimanual haptic feedback can provide more efficient and accurate performance than the baselines for spatial guidance and search in VR. Based on these findings, we have derived a set of design recommendations for spatial search using bimanual haptic feedback in VR.

FanPad: A Fan Layout Touchpad Keyboard for Text Entry in VR (Conference: P1301)

Jian Wu, Beihang University; Ziteng Wang, Beihang University; Lili Wang, Beihang University; Jiaheng Li, Beihang University; Yuhan Duan, Beihang University

Text entry poses a significant challenge in virtual reality (VR). This paper introduces FanPad, a novel solution that facilitates dual-hand text input within head-mounted displays (HMDs). FanPad accomplishes this by mapping and curving the QWERTY keyboard onto the touchpads of both controllers. The curved key layout of FanPad is derived from the natural movement of the thumb when interacting with the touchpad, resembling an arc with a fixed, thumb-length radius. We conducted two comprehensive user studies to assess the performance of our FanPad method. Notably, novices achieved a typing speed of 19.73 words per minute (WPM). The highest typing speed reached an impressive 24.19 WPM.
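The arc-shaped key layout can be approximated with a few lines of geometry. This toy Python sketch (radii, angular span, and row spacing are invented values, not FanPad's) places each row of the left-hand QWERTY half along an arc of fixed radius, mimicking the thumb's reach on a circular touchpad.

```python
# Toy sketch of laying out QWERTY rows along arcs of a fixed (thumb-length)
# radius on a touchpad; all numeric values are made up for illustration.
import math

ROWS = ["qwert", "asdfg", "zxcvb"]   # left-hand half of the keyboard

def fan_layout(rows, radius=0.8, span_deg=70.0, row_gap=0.18):
    keys = {}
    for r, row in enumerate(rows):
        arc_r = radius - r * row_gap                 # lower rows sit closer to the thumb base
        for i, ch in enumerate(row):
            # Spread the row's keys evenly across the angular span of the arc.
            theta = math.radians(-span_deg / 2 + span_deg * i / (len(row) - 1))
            keys[ch] = (arc_r * math.sin(theta), arc_r * math.cos(theta))
    return keys

layout = fan_layout(ROWS)
print(layout["a"])   # normalized (x, y) position of the "a" key on the touchpad
```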

Eye-Hand Coordination Training: A Systematic Comparison of 2D, VR, and AR Screen Technologies and Task Motives (Conference: P1309)

Aliza Aliza, Kadir Has University; Irene Zaugg, Colorado State University; Elif Çelik, Kadir Has University; Wolfgang Stuerzlinger, Simon Fraser University; Francisco Raul Ortega, Colorado State University; Anil Ufuk Batmaz, Concordia University; Mine Sarac, Kadir Has University

In this paper, we compare user motor performance with an eye-hand coordination training (EHCT) setup in Augmented Reality (AR), Virtual Reality (VR), and on a 2D touchscreen display in a longitudinal study. Through a ten-day user study, we thoroughly analyzed the motor performance of twenty participants under five task instructions focusing on speed, error rate, accuracy, precision, or no particular criterion. As a novel evaluation criterion, we also analyzed the participants' performance in terms of effective throughput. The results showed that each task instruction has a different effect on one or more psychomotor characteristics of the trainee, which highlights the importance of personalized training programs.

The Benefits of Near-field Manipulation and Viewing to Distant Object Manipulation in VR (Conference: P1663)

Sabarish V. Babu, Clemson University; Wei-An Hsieh, National Yang Ming Chiao Tung University; Jung-Hong Chuang, National Yang Ming Chiao Tung University

In this contribution, we propose to enhance two distant object manipulation techniques, BMSR and Scaled HOMER, via near-field scaled replica manipulation and viewing. In the proposed Direct BMSR, users are allowed to directly manipulate the target replica in their arm's reach space. In Scaled HOMER+NFSRV, Scaled HOMER is augmented with a near-field scaled replica view of the target object and its context. We conducted a between-subjects empirical evaluation of BMSR, Direct BMSR, Scaled HOMER, and Scaled HOMER+NFSRV. Our findings revealed that Direct BMSR and Scaled HOMER+NFSRV significantly outperformed BMSR and Scaled HOMER, respectively, in terms of accuracy.

Evaluating an In-Hand Ball-Shaped Controller for Object Manipulation in Virtual Reality (Conference: P1978)

Sunbum Kim, School of Computing, KAIST; Geehyuk Lee, School of Computing, KAIST

This study explored the use of a ball-shaped controller for object manipulation in virtual reality. We developed a ball-shaped controller with pressure-sensing capabilities featuring specifically designed interactions for object manipulation, including selection, translation, rotation, and scaling. We evaluated it on tasks involving both 6-DOF and 7-DOF object manipulation, including close and distant ranges. The results indicated that the ball-shaped controller performed similarly to VR controller methods for direct manipulation but excelled in reducing completion times and task load for distant object manipulation. Additionally, the ball-shaped controller minimized wrist and arm movements and was the preferred method among participants.

Session: Multisensory Interfaces (TU1H)

Date & Time: Tuesday, 19 March 2024, 8:30-9:45 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom H
Session Chair: Daniel Roth

Exploring audio interfaces for vertical micro-guidance in augmented reality via hand-based feedback (Journal: P1062)

Renan L Martins Guarese, RMIT; Emma Pretty, RMIT; Aidan Renata, Royal Melbourne Institute of Technology; Debra Polson, Queensland University of Technology; Fabio Zambetta, RMIT University

This research proposes an evaluation of pitch-based sonification methods via user experiments in real-life scenarios, specifically vertical guidance, with the aim of standardizing the use of audio interfaces for AR guidance tasks. Drawing on literature on assistive technology for people who are visually impaired, we aim to generalize its applicability to a broader population and to different use cases. We propose and test sonification methods for vertical guidance in hand-navigation assessments with users who received no visual feedback. Incorporating feedback from a visually impaired expert in digital accessibility, the results showed that methods that do not rely on memorizing pitch had the most promising accuracy and workload performance.

Listen2Scene: Interactive material-aware binaural sound propagation for reconstructed 3D scenes (Conference: P1075)

Anton Jeran Ratnarajah, University of Maryland, College Park; Dinesh Manocha, University of Maryland

We present an end-to-end binaural audio rendering approach (Listen2Scene) for VR and AR applications. We propose a novel neural-network-based binaural sound propagation method to generate acoustic effects for indoor 3D models of real environments. Any clean or dry audio can be convolved with the generated acoustic effects to render audio corresponding to the real environment. We have evaluated the accuracy of our approach against binaural acoustic effects generated using an interactive geometric sound propagation algorithm and against captured real acoustic effects / real-world recordings. The demo videos, code, and dataset are available at https://anton-jeran.github.io/Listen2Scene/.
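The final rendering step, convolving dry audio with generated acoustic effects, can be sketched as follows; how the impulse responses are predicted by the network is outside this snippet, and the signal lengths are placeholders.

```python
# Simple sketch of binaural rendering: convolve a dry (anechoic) signal with
# left/right acoustic impulse responses. The IRs here are random placeholders.
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(dry, ir_left, ir_right):
    left = fftconvolve(dry, ir_left)     # apply the room's acoustic effect per ear
    right = fftconvolve(dry, ir_right)
    out = np.stack([left, right], axis=0)
    return out / (np.max(np.abs(out)) + 1e-9)   # normalize to avoid clipping

dry = np.random.randn(48000)                    # 1 s of dry audio at 48 kHz (placeholder)
ir_l, ir_r = np.random.randn(2, 24000) * 0.01   # placeholder binaural impulse responses
binaural = render_binaural(dry, ir_l, ir_r)     # shape (2, samples): left/right channels
```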

Influence of user's body in olfactory virtual environment generated by real-time CFD (Conference: P1513)

Masafumi Uda, School of Engineering; Takamichi Nakamoto, Tokyo Institute of Technology

We have developed a virtual olfactory environment using an olfactory display and computational fluid dynamics (CFD) simulation. Although CFD can compute the odor distribution in complicated geometry, its computational cost was high and it did not run in real time in the previous study. In this study, real-time CFD based on GPU computation was introduced to generate a real-time olfactory VR environment. We investigated the influence of the user's body as its location and orientation changed irregularly. The experimental results indicate the usefulness of accounting for the effect of the user's body, since that influence cannot be avoided.

OdorAgent: Generate Odor Sequences for Movies Based on Large Language Model (Conference: P1942)

Yu Zhang, Tsinghua University; Peizhong Gao, Tsinghua University; Fangzhou Kang, Tsinghua University; Jiaxiang Li, Tsinghua University; Jiacheng Liu, Tsinghua University; Qi Lu, Tsinghua University; Yingqing Xu, Tsinghua University

Numerous studies have shown that integrating scents into movies enhances viewer engagement and immersion. However, creating such olfactory experiences often requires professional perfumers to match scents, limiting their widespread use. To address this, we propose OdorAgent, which combines an LLM with a text-image model to automate video-odor matching. The generation framework considers four dimensions: subject matter, emotion, space, and time. We applied it to a specific movie and conducted user studies to evaluate and compare the effectiveness of different system elements. The results indicate that OdorAgent possesses significant scene adaptability and enables inexperienced individuals to design odor experiences for video and images.

EmoFace: Audio-driven Emotional 3D Face Animation (Conference: P1973)

Chang Liu, Shanghai Jiao Tong University; Qunfen Lin, Tencent Games; Zijiao Zeng, Tencent Games; Ye Pan, Shanghai Jiaotong University

EmoFace is a novel audio-driven methodology for creating facial animations with vivid emotional dynamics. It has the ability to generate dynamic facial animations with diverse emotions, synchronized lip movements, and natural blinks. Incorporating independent speech and emotion encoders, our approach establishes a robust link between audio, emotion, and facial controller rigs. Post-processing techniques enhance authenticity, focusing on blinks and eye movements. We also contribute an emotional audio-visual dataset and derive control parameters for each frame to drive MetaHuman models. Quantitative assessments and user studies validate the efficacy of our innovative approach.

Session: Evaluating Immersion: UX and Interaction (TU1J)

Date & Time: Tuesday, 19 March 2024, 8:30-9:45 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom J
Session Chair: Doug Bowman

Task-based methodology to characterise immersive user experience with multivariate data (Conference: P1181)

Florent Robert, Université Côte d'Azur; Hui-Yin Wu, Centre Inria d'Université Côte d'Azur; Lucile Sassatelli, Université Côte d'Azur; Marco Winckler, Université Côte d'Azur

Virtual Reality technologies are promising for research; however, evaluating the user experience in immersive environments is daunting, as the richness of the media makes it challenging to synchronise context with behavioural metrics. We propose a task-based methodology that provides fine-grained descriptions and analyses of the experiential user experience in VR that (1) aligns low-level tasks with behavioural metrics, (2) defines performance components with baseline values to evaluate task performance, and (3) characterises task performance with multivariate user behaviour data. We find that the methodology allows us to better observe the experiential user experience by highlighting relations between user behaviour and task performance.

NeRF-NQA: No-Reference Quality Assessment for Scenes Generated by NeRF and Neural View Synthesis Methods (Journal: P1221)

Qiang Qu, The University of Sydney; Hanxue Liang, University of Cambridge; Xiaoming Chen, Beijing Technology and Business University; Yuk Ying Chung, The University of Sydney; Yiran Shen, Shandong University

Neural View Synthesis (NVS) creates dense viewpoint videos from sparse images, but traditional metrics like PSNR and SSIM inadequately assess NVS and NeRF-synthesized scenes. Limited dense ground truth views in datasets like LLFF hinder full-reference quality evaluation. Addressing this, we introduce NeRF-NQA, a novel no-reference quality assessment method for NVS and NeRF scenes. It uniquely combines viewwise and pointwise evaluations to assess spatial and angular qualities. Extensive testing against 23 established visual quality methods demonstrates NeRF-NQA's superior performance in assessing NVS-synthesized scenes without reference data, marking a significant advancement in NVS quality assessment.

Exploring Controller-based Techniques for Precise and Rapid Text Selection in Virtual Reality (Conference: P1271)

Jianbin Song, Xi'an Jiaotong-Liverpool University; Rongkai Shi, Xi'an Jiaotong-Liverpool University; Yue Li, Xi'an Jiaotong-Liverpool University; BoYu Gao, Jinan University; Hai-Ning Liang, Xi'an Jiaotong-Liverpool University

Text selection, while common, can be difficult because the letters and words are too small and clustered together to allow precise selection. There has been limited exploration of techniques that support accurate and rapid text selection at the character, word, sentence, or paragraph levels in VR HMDs. We present three controller-based text selection methods: Joystick Movement, Depth Movement, and Wrist Orientation. They are evaluated against a baseline method via a user study with 24 participants. Results show that the three proposed techniques significantly improved the performance and user experience over the baseline, especially for the selection beyond the character level.

Evaluating the Effect of Binaural Auralization on Audiovisual Plausibility and Communication Behavior in Virtual Reality (Conference: P1872)

Felix Immohr, Technische Universität Ilmenau; Gareth Rendle, Bauhaus-Universität Weimar; Anton Benjamin Lammert, Bauhaus-Universität Weimar; Annika Neidhardt, University of Surrey; Victoria Meyer Zur Heyde, Technische Universität Ilmenau; Bernd Froehlich, Bauhaus-Universität Weimar; Alexander Raake, TU Ilmenau

Spatial audio has been shown to positively impact user experience in traditional communication media and presence in single-user VR. This work further investigates whether spatial audio benefits immersive communication scenarios. We present a study in which dyads communicate in VR under different auralization and scene arrangement conditions. A novel task is designed to increase the relevance of spatial hearing. Results are obtained through social presence and plausibility questionnaires, and through conversational and behavioral analysis. Although participants are shown to favor binaural over diotic audio in a direct comparison, no significant differences were observed from the other presented measures.

Eyes on the Task: Gaze Analysis of Situated Visualization for Collaborative Tasks (Conference: P1898)

Nelusa Pathmanathan, University of Stuttgart; Tobias Rau, University of Stuttgart; Xiliu Yang, Institute of Computational Design and Construction; Aimée Sousa Calepso, University of Stuttgart; Felix Amtsberg, Institute of Computational Design and Construction; Achim Menges, University of Stuttgart; Michael Sedlmair, University of Stuttgart; Kuno Kurzhals, University of Stuttgart

The use of augmented reality technology to support humans within complex and collaborative tasks has gained increasing importance. Analyzing collaboration patterns is usually done by conducting observations and interviews. We argue that eye tracking can be used to extract further insights and quantify behavior. To this end, we contribute a study that uses eye tracking to investigate participant strategies for solving collaborative sorting and assembly tasks. We compare participants' visual attention during situated instructions in AR and traditional paper-based instructions as a baseline. By investigating the performance and gaze behavior of the participants, different strategies for solving the provided tasks are revealed.

Session: Locomotion and Redirection (TU2G)

Date & Time: Tuesday, 19 March 2024, 13:30-15:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom G
Session Chair: Evan Ruma Rosenberg

illumotion: An Optical-illusion-based VR Locomotion Technique for Long-Distance 3D Movement (Conference: P1841)

Zackary P. T. Sin, The Hong Kong Polytechnic University; Ye Jia, The Hong Kong Polytechnic University; Chen Li, The Hong Kong Polytechnic University; Hong Va Leong, The Hong Kong Polytechnic University; Qing Li, The Hong Kong Polytechnic University; Peter H.F. Ng, The Hong Kong Polytechnic University

Locomotion has a marked impact on user experience in VR, but common go-to techniques such as steering and teleportation have their limitations. In particular, steering is prone to cybersickness, while teleportation trades presence for mitigating cybersickness. Inspired by how we manipulate a picture on a phone, we propose illumotion, an optical-illusion-based method that, we believe, can be an alternative for locomotion. Instead of zooming into a picture by pinching two fingers, we can move forward by “zooming” toward part of the 3D virtual scene with pinched hands. Results show that, compared with either teleportation, steering, or both, illumotion offers better performance, presence, usability, and user experience, as well as better cybersickness alleviation.

Exploring Visual-Auditory Redirected Walking using Auditory Cues in Reality (Invited Journal: P3007)

Kumpei Ogawa; Kazuyuki Fujita; Shuichi Sakamoto; Kazuki Takashima; Yoshifumi Kitamura

We examine the effect of auditory cues occurring in reality on redirection. Specifically, we formulated two hypotheses: auditory cues emanating from fixed positions in reality (Fixed sound, FS) increase the noticeability of redirection, while auditory cues whose positions are manipulated consistently with the visual manipulation (Redirected sound, RDS) decrease the noticeability of redirection. To verify these hypotheses, we implemented an experimental environment that virtually reproduced the FS and RDS conditions using binaural recording, and then we conducted a user study (N = 18) to investigate the detection thresholds (DTs) for rotational manipulation and the sound localization accuracy of the auditory cues under FS and RDS, as well as a baseline condition without auditory cues (No sound, NS). Contrary to the hypotheses, the results show that FS gave a wider range of DTs than NS, while RDS gave a similar range of DTs to NS. Combining these results with those of sound localization accuracy reveals that, rather than the auditory cues affecting the participants' spatial perception in VR, the visual manipulation made their sound localization less accurate, which would be a reason for the increased range of DTs under FS. Furthermore, we conducted a follow-up user study (N = 11) to measure the sound localization accuracy of FS where the auditory cues were actually placed in a real setting, and we found that the accuracy tended to be similar to that of virtually reproduced FS, suggesting the validity of the auditory cues used in this study. Given these findings, we also discuss potential applications.

RedirectedDoors+: Door-Opening Redirection with Dynamic Haptics in Room-Scale VR (Journal: P1074)

Yukai Hoshikawa, Tohoku University; Kazuyuki Fujita, Tohoku University; Kazuki Takashima, Tohoku University; Morten Fjeld, Chalmers University of Technology; Yoshifumi Kitamura, Tohoku University

RedirectedDoors+ is a robot-based VR system that enhances the original RedirectedDoors by offering dynamic haptics during consecutive door openings. It utilizes door robots, a robot-positioning algorithm for just-in-time haptic feedback, and a user-steering algorithm for user navigation within limited areas. A simulation study, tested in six VR environments, reveals our system’s performance in relation to user walking speed, paths, and the number of door robots, leading to the derivation of usage guidelines. Additionally, a study with 12 participants confirms the system’s effectiveness in providing haptic feedback and redirecting users in confined spaces during a walkthrough application.

Virtual Steps: The Experience of Walking for a Lifelong Wheelchair User in Virtual Reality (Conference: P1509)

Atieh Taheri, University of California, Santa Barbara; Arthur Caetano, University of California, Santa Barbara; Misha Sra, University of California, Santa Barbara

We co-designed a VR walking experience with a person with Spinal Muscular Atrophy who has been a lifelong wheelchair user. Over 9 days, we collected and analyzed data on this person's experience through a diary study to understand the required design elements. Given that they had only seen others walking and had not directly experienced it, determining which design parameters must be considered to match the virtual experience to their mental model was challenging. Generally, we found the experience of walking to be quite positive, providing a perspective from a higher vantage point than what was available in a wheelchair.

APF-S2T: Steering to Target Redirection Walking Based on Artificial Potential Fields (Journal: P1943)

Jun-Jie Chen, National Yang Ming Chiao Tung University; Huan-Chang Hung, National Yang Ming Chiao Tung University; Yu-Ru Sun, National Yang Ming Chiao Tung University; Jung-Hong Chuang, National Yang Ming Chiao Tung University

This paper introduces a novel APF-based redirected walking controller, called APF Steer-to-Target (APF-S2T). Unlike previous APF-based controllers, APF-S2T locates the target sample with the lowest score within the user's walkable areas in both physical and virtual space. The score is defined based on the APF values and the distance to the user. The direction from the user position to the target sample serves as the steering direction used to set RDW gains. A comparative simulation-based evaluation reveals that APF-S2T outperforms state-of-the-art controllers in terms of the number of resets and the average reset distance.
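
The abstract does not give the exact scoring function, but a minimal illustrative sketch of the general idea (choosing the lowest-scoring sample from a weighted combination of APF value and distance to the user, then steering toward it) might look as follows; the function names and weights are hypothetical, not the authors' implementation.

    # Hypothetical sketch of score-based steering-target selection (not the
    # authors' code): combine the APF value at a candidate sample with its
    # distance to the user, pick the lowest-scoring sample, and steer toward it.
    import math

    def score(sample, user_pos, apf_value, w_apf=1.0, w_dist=0.5):
        # Lower is better: low repulsive potential and close to the user.
        return w_apf * apf_value + w_dist * math.dist(sample, user_pos)

    def steering_direction(samples, apf_values, user_pos):
        # samples: (x, y) points inside the walkable area of both spaces;
        # apf_values: APF value at each sample (e.g., summed obstacle repulsion).
        target, _ = min(zip(samples, apf_values),
                        key=lambda sa: score(sa[0], user_pos, sa[1]))
        dx, dy = target[0] - user_pos[0], target[1] - user_pos[1]
        norm = math.hypot(dx, dy) or 1.0
        return (dx / norm, dy / norm)  # unit vector used to set RDW gains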

SafeRDW: Keep VR Users Safe When Jumping with Redirected Walking (Conference: P2110)

Sen-Zhe Xu, Tsinghua University; Kui Huang, Tsinghua University; Cheng-Wei Fan, Tsinghua University; Song-Hai Zhang, Tsinghua University

Existing redirected walking (RDW) algorithms typically focus on reducing collisions between users and obstacles during walking but overlook user safety when users perform significant actions such as jumping. This oversight can pose serious risks during VR exploration, especially when physical obstacles or boundaries lie near the virtual locations that require jumping. We propose SafeRDW, the first RDW algorithm that takes the user's jumping safety into consideration. The proposed method redirects users to safe physical locations when a jump is required in the virtual space, ensuring user safety. Simulation experiments and user study results both show that our method not only reduces the number of resets but also substantially improves user safety when users reach the jumping points in the virtual scene.

Session: Projections (TU2H)

Date & Time: Tuesday, 19 March 2024, 13:30-15:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom H
Session Chair: Bruce Thomas

ViComp: Video Compensation for Projector-Camera Systems (Journal: P1096)

Yuxi Wang, Hangzhou Dianzi University; Haibin Ling, Stony Brook University; Bingyao Huang, Southwest University

Projector video compensation aims to cancel the geometric and photometric distortions caused by non-ideal projection surfaces and environments when projecting videos. This paper builds an online video compensation system that compensates frames and adjusts model parameters in parallel. By integrating an efficient deep learning-based compensation model, our system can be rapidly configured to unknown environments with good performance. Moreover, the proposed update strategy combining long-term and short-term memory mechanisms enables the compensation model to adapt to the target configuration and video content without manual intervention. Experiments show that our system significantly outperforms state-of-the-art baselines.

Best Paper Award

Projection Mapping under Environmental Lighting by Replacing Room Lights with Heterogeneous Projectors (Journal: P1121)

Masaki Takeuchi, Osaka University; Hiroki Kusuyama, Osaka University; Daisuke Iwai, Osaka University; Kosuke Sato, Osaka University

Projection mapping (PM) typically requires a dark environment to achieve high-quality projections, limiting its practicality. In this paper, we overcome this limitation by replacing conventional room lighting with heterogeneous projectors. These projectors replicate environmental lighting by selectively illuminating the scene, excluding the projection target. Our contributions include a distributed projector optimization framework designed to effectively replicate environmental lighting and the incorporation of a large-aperture projector to reduce high-luminance emitted rays and hard shadows. Our findings demonstrate that our projector-based lighting system significantly enhances the contrast and realism of PM results.

Projection Mapping with a Brightly Lit Surrounding Using a Mixed Light Field Approach (Journal: P1606)

Masahiko Yasui, Tokyo Institute of Technology; Ryota Iwataki, Tokyo Institute of Technology; Masatoshi Ishikawa, Tokyo University of Science; Yoshihiro Watanabe, Tokyo Institute of Technology

Projection mapping (PM) exhibits suboptimal performance in well-lit environments because of the ambient light. This interference degrades the contrast of the projected images. To overcome these limitations, we introduce an innovative approach that leverages a mixed light field, blending traditional PM with ray-controllable ambient lighting. This methodological combination ensures that the projector exclusively illuminates the PM target, preserving the optimal contrast. Furthermore, we propose the integration of a kaleidoscopic array with integral photography to generate dense light fields for ray-controllable ambient lighting. Our optical simulations and the developed system collectively validate the effectiveness of our approach.

Real-time Seamless Multi-Projector Displays on Deformable Surfaces (Journal: P2001)

Muhammad Twaha Ibrahim, University of California, Irvine; Gopi Meenakshisundaram, University of California, Irvine; Aditi Majumder, University of California, Irvine

Prior work on projection displays on deformable surfaces has focused mostly on small-scale, single-projector displays. In this work, we present the first end-to-end solution for achieving a real-time, seamless display on deformable surfaces using multiple unsynchronized projectors, without requiring any prior knowledge of the surface or device parameters. Using multiple projectors and RGB-D cameras, we provide the much-desired aspect of scale to displays on deformable and dynamic surfaces. This work has tremendous applications in mobile and expeditionary systems, such as military or emergency operations in austere locations, and in displays on inflatable objects for trade shows, events, and touring edutainment applications.

Best Paper Honorable Mention

Towards Co-operative Beaming Displays: Dual Steering Projectors for Extended Projection Volume and Head Orientation Range (Journal: P2043)

Hiroto Aoki, The University of Tokyo; Takumi Tochimoto, Tokyo Institute of Technology; Yuichi Hiroi, Cluster Inc.; Yuta Itoh, The University of Tokyo

This study tackles trade-offs in existing near-eye displays (NEDs) by introducing a beaming display with dual steering projectors. While a traditional NED faces challenges in size, weight, and user limitations, the beaming display separates the NED into a steering projector (SP) and a passive headset. To overcome the limitations of a single SP, dual projectors are distributed to extend the supported head orientation range. A geometric model and a calibration method for multiple projectors are proposed. The prototype achieves a precision of 1.8–5.7 mm and a delay of 14.46 ms at 1 m, projecting images onto the passive headset's area (20 mm × 30 mm) and enabling multiple users with improved presentation features.

3D Gamut Morphing for Non-Rectangular Multi-Projector Displays (Invited Journal: P3006)

Mahdi Abbaspour Tehrani, Genentech, USA; Muhammad Twaha Ibrahim, University of California, Irvine, USA; Aditi Majumder, University of California, Irvine, USA; M. Gopi, University of California, Irvine, USA

In a spatially augmented reality system, multiple projectors are tiled on a complex-shaped surface to create a seamless display on it. This has several applications in visualization, gaming, education, and entertainment. The main challenges in creating seamless and undistorted imagery on such complex-shaped surfaces are geometric registration and color correction. Prior methods that address spatial color variation in multi-projector displays assume rectangular overlap regions across the projectors, which is possible only on flat surfaces with extremely constrained projector placement. In this paper, we present a novel, fully automated method for removing color variations in a multi-projector display on arbitrarily shaped smooth surfaces, using a general color gamut morphing algorithm that can handle any arbitrarily shaped overlap between the projectors and ensures imperceptible color variations across the display surface.

Session: 3D Interaction and Teleoperation (TU2J)

Date & Time: Tuesday, 19 March 2024, 13:30-15:00 (Orlando, Florida USA UTC-4)
Room: Fantasia Ballroom J
Session Chair: Tim Weissker

Asynchronously Assigning, Monitoring, and Managing Assembly Goals in Virtual Reality for High-Level Robot Teleoperation (Conference: P1006)

Shutaro Aoyama, Columbia University; Jen-Shuo Liu, Columbia University; Portia Wang, Columbia University; Shreeya Jain, Columbia University; Xuezhen Wang, Columbia University; Jingxi Xu, Columbia University; Shuran Song, Columbia University; Barbara Tversky, Columbia Teachers College; Steven Feiner, Columbia University

We present a prototype virtual reality user interface for robot teleoperation that supports high-level goal specification in remote assembly tasks. Users interact with virtual replicas of task objects. They asynchronously assign multiple goals in the form of 6DoF destination poses without needing to be familiar with specific robots and their capabilities, and manage and monitor the execution of these goals. The user interface employs two different spatiotemporal visualizations for assigned goals: one represents all goals within the user’s workspace, while the other depicts each goal within a separate world in miniature. We conducted a user study of the interface without the robot system to compare these visualizations.

Exploring Bi-Manual Teleportation in Virtual Reality (Conference: P1602)

Siddhanth Raja Sindhupathiraja, Indian Institute of Technology Delhi; A K M Amanat Ullah, University of British Columbia, Okanagan; William Delamare, ESTIA; Khalad Hasan, University of British Columbia

Enhanced hand tracking in modern VR headsets has popularized hands-only teleportation, which allows instantaneous movement within VR environments. However, previous work on hands-only teleportation has not fully explored the potential of bi-manual input (where each hand plays a distinct role), the influence of users’ posture (whether sitting or standing), or assessments based on human motor models (such as Fitts’ Law). To address these gaps, we conducted a user study (N=20) comparing the performance of bi-manual and uni-manual techniques in VR teleportation tasks using a proposed Fitts’ Law model, considering both postures. Results showed that bi-manual techniques enable faster and more accurate teleportation than the other methods.
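
For context only (the abstract does not specify the paper's exact variant), Fitts' Law models of this kind typically adapt the standard Shannon formulation, which relates movement time MT to the target distance D and width W via empirically fitted constants a and b:

    MT = a + b * log2(D / W + 1)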

BaggingHook: Selecting Moving Targets by Pruning Distractors Away for Intention-Prediction Heuristics in Dense 3D Environments (Conference: P1733)

Paolo Boffi, King Abdullah University of Science and Technology (KAUST); Alexandre Kouyoumdjian, King Abdullah University of Science and Technology; Manuela Waldner, TU Wien; Pier Luca Lanzi, Politecnico di Milano; Ivan Viola, King Abdullah University of Science and Technology

This study presents two novel selection techniques, BaggingHook and AutoBaggingHook, based on distractor pruning and built upon the Hook intention-prediction heuristic. Our techniques reduce the number of candidate targets in the environment by rendering pruned distractors semi-transparent, which expedites heuristic convergence and reduces occlusion. BaggingHook allows manual distractor pruning, while AutoBaggingHook employs automated, score-based pruning. Results from a user study comparing both techniques to the Hook baseline show that AutoBaggingHook was the fastest, while BaggingHook was preferred by most users for its greater user control. This work highlights the benefits of varying inputs in intention-prediction heuristics to improve performance and user experience.

SkiMR: Dwell-free Eye Typing in Mixed Reality (Conference: P1936)

Jinghui Hu, University of Cambridge; John J Dudley, University of Cambridge; Per Ola Kristensson, University of Cambridge

We present SkiMR, a dwell-free eye typing system that enables fast and accurate hands-free text entry on mixed reality headsets. SkiMR uses a statistical decoder to infer users' intended text from their eye movements on a virtual keyboard, bypassing the need for dwell timeouts. We conducted two studies with a HoloLens 2: the first (n=12) showed SkiMR's superior speed over a traditional dwell-based method and a hybrid method. The second study (n=16) focused on composition tasks with a refined system, revealing that users could compose text at 12 words per minute with a 1.1% corrected error rate. Overall, this work demonstrates the high potential of fast and accurate hands-free text entry for MR headsets.