Paper Abstracts for Monday, August 20


11:30 - 11:55   Monday, August 20   Room: Michelangelo,   Paper Session: Production & Tools

AUTHORS:   Michael Costagliola,  Yale University

TITLE:   Multi-user shared augmented audio spaces using motion capture systems

This paper describes a method for creating multi-user shared augmented reality audio spaces. By using a system of infrared cameras and motion capture software, it is possible to provide accurate low-latency head tracking for many users simultaneously, and stream binaural audio representing a realistic, shared virtual environment to each user. Participants can thus occupy and navigate a shared virtual aural space without the use of head-mounted displays, only headphones (with passive markers affixed) connected to lightweight in-ear monitor beltpacks. Potential applications include installation work, classroom use, and museum audio tours.

12:00 - 12:25   Monday, August 20   Room: Michelangelo,   Paper Session: Production & Tools

AUTHORS:   Giordano Jacuzzi,  Sennheiser Electronic GmbH

TITLE:   Augmented Audio: An Overview of the Unique Tools and Features Required for Creating AR Audio Experiences

What a user sees in augmented reality is only part of the experience. To create a truly compelling journey, we must augment what a user hears in reality as well. In this paper, we consider Augmented Audio to be the sound of AR, and discuss a technique by which the binaural rendering of virtual sounds (Augmented) is combined with the manipulation of the real-world sound surrounding a listener (Reality). We outline the unique challenges that arise when designing audio experiences for AR, and document the current state-of-the-art for Augmented Audio solutions. Using the Sennheiser AMBEO Smart Headset as a case study, we describe the essential features of an Augmented Audio device and its integration with an AR application.

12:30 - 12:55   Monday, August 20   Room: Michelangelo,   Paper Session: Production & Tools

AUTHORS:   Jukka Holm and Mark Malyshev,  Tampere University of Applied Sciences, Finland

TITLE:   Spatial Audio Production for 360-Degree Live Music Videos: Multi-Camera Case Studies

This paper discusses the different aspects of mixing for 360-degree multi-camera live music videos. We describe our two spatial audio production workflows, which were developed and fine-tuned through a series of case studies including rock, pop, and orchestral music. The different genres were chosen to test if the production tools and techniques were equally efficient for mixing different types of music. In our workflows, one of the most important parts of the mixing process is to match the Ambisonics mix with a stereo reference. Among other things, the process includes automation, proximity effects, creating a sense of space, and managing transitions between cameras.

2:00 - 2:25   Monday, August 20   Room: Michelangelo,   Paper Session: Perception & Evaluation

AUTHORS:   Angela Mcarthur, Mark Sandler and Rebecca Stewart,  Queen Mary University of London, UK

TITLE:   Perception of mismatched auditory distance - cinematic VR

This study examines auditory distance discrimination in cinematic virtual reality. Using controlled stimuli with audiovisual distance variations, it determines if mismatched stimuli are detected. It asks whether visual conditions (either equally or unequally distanced from the user) and environmental conditions (either a reverberant space or a freer field) affect accuracy in discriminating between congruent and incongruent aural and visual cues. A Repertory Grid Technique-derived design, whereby participant-specific constructs are translated into numerical ratings, is used. Discrimination of auditory event mismatch was improved for stimuli with varied visual-event distances, though not for equidistant visual events. This may demonstrate that visual cues alert users to matches and mismatches, but can lead responses toward both greater and lesser accuracy.

2:30 - 2:55   Monday, August 20   Room: Michelangelo,   Paper Session: Perception & Evaluation

AUTHORS:   Hanne Stenzel, Philip J.B. Jackson,  University of Surrey, UK, Jon Francombe,  BBC Research and Development, UK

TITLE:   Reaction times of spatially coherent and incoherent signals in a word recognition test

Using conventional sound design, the audio signal in VR applications is often reduced to a static stereophonic signal that is accompanied by a visual signal that allows for interactive behavior such as looking around. In the current test, the influence of spatial offset between the audio and visual signals is investigated using reaction time measurements in a word recognition task. The audio-visual offset is introduced by a video presented at horizontal offset angles between +/-20 degrees, accompanied by static central audio. Measurements are compared to reaction times from a test where both audio and visual signals are presented with the same offset. Results show that the spatial offset introduces changes in the reaction times exhibiting greater within-participant differences.

3:00 - 3:25   Monday, August 20   Room: Michelangelo,   Paper Session: Perception & Evaluation

AUTHORS:   G. Christopher Stecker,  Vanderbilt University

TITLE:   Toward objective measures of auditory co-immersion in virtual and augmented reality

"Co-immersion" refers to the perception of real or virtual objects as contained within or belonging to a shared multisensory scene. Environmental features such as lighting and reverberation contribute to the experience of co-immersion even when awareness of those features is not explicit. Objective measures of co-immersion are needed to validate user experience and accessibility in augmented-reality applications, particularly those that aim for "face-to-face" quality. Here, we describe an approach that combines psychophysical measurement with virtual-reality games to assess users' sensitivity to room-acoustic differences across concurrent talkers in a simulated complex scene. Eliminating the need for explicit judgments, Odd-one-out tasks allow psychophysical thresholds to be measured and compared directly across devices, algorithms, and user populations. Supported by NIH-R41-DC16578.

3:30 - 3:55   Monday, August 20   Room: Michelangelo,   Paper Session: Perception & Evaluation

AUTHORS:   Olli Rummukainen, Thomas Robotham, Sebastian J. Schlecht, Axel Plinge, Jürgen Herre and Emanuël A. P. Habets,  Fraunhofer, Germany

TITLE:   Audio Quality Evaluation in Virtual Reality: Multiple Stimulus Ranking with Behavior Tracking

Virtual reality systems with multimodal stimulation and up to six degrees-of-freedom movement pose novel challenges to audio quality evaluation. This paper adapts classic multiple stimulus test methodology to virtual reality and adds behavioral tracking functionality. The method is based on ranking by elimination while exploring an audiovisual virtual reality. The proposed evaluation method allows immersion in multimodal virtual scenes while enabling comparative evaluation of multiple binaural renderers. A pilot study is conducted to evaluate feasibility of the proposed method and to identify challenges in virtual reality audio quality evaluation. Finally, the results are compared to a non-immersive off-line evaluation method.

4:00 - 4:25   Monday, August 20   Room: Michelangelo,   Paper Session: Perception & Evaluation

AUTHORS:   Gregory Reardon, Andrea Genovese, Gabriel Zalles, and Agnieszka Roginska,  New York University, Patrick Flanagan,  THX Ltd.

TITLE:   Evaluation of Binaural Renderers: Multidimensional Sound Quality Assessment

A multi-phase subjective experiment evaluating six commercially available binaural audio renderers was carried out. This paper presents the methodology, evaluation criteria, and main findings of the tests which assessed perceived sound quality of the renderers. Subjects appraised a number of specific sound quality attributes - timbral balance, clarity, naturalness, spaciousness, and dialogue intelligibility - and ranked, in terms of preference, the renderers for a set of music and movie stimuli presented over headphones. Results indicated that differences between the perceived quality and preference for a renderer are discernible. Binaural renderer performance was also found to be highly content-dependent, with significant interactions between renderers and individual stimuli being found, making it difficult to determine an "optimal" renderer for all settings.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Otto Puomio and Tapio Lokki,  Aalto University, Finland

TITLE:   ICHO: Immersive Concert for Homes

The concert hall experience at home has been limited to stereo and 5.1 surround sound reproduction. However, these reproduction systems do not convey the spatial properties of the concert hall acoustics in detail, and specifically for headphones the sound tends to be perceived as playing inside the head. The ICHO project introduced in this paper aims to bring an immersive concert hall experience to home listeners. This is realized by using close pick-up of sound sources, spatial room impulse responses, and individualized head-related transfer functions, all combined for spatial sound reproduction with head-tracked headphones. This paper outlines how this goal is going to be achieved and how the quality of the reproduction might be evaluated.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Sebastian Nagel, Tobias Kabzinski, Stefan Kühl, Christiane Antweiler and Peter Jax,  Institute of Communication Systems (IKS), RWTH Aachen University, Germany

TITLE:   Acoustic Head-Tracking for Acquisition of Head-Related Transfer Functions with Unconstrained Subject Movement

Recently proposed Head-Related Transfer Function acquisition methods include head-tracking to allow unconstrained subject movements during the measurement. This enables fast measurements with less equipment than previous fast measurement approaches. In this paper, we propose a novel acoustic head-tracking concept, which is particularly suited for this application. Only a head-mounted microphone array and additional recording channels are required in addition to a regular measurement setup. Unlike other head-tracking systems, the tracking data is inherently synchronized to the acoustic measurements and the angle of sound incidence can be accurately determined without knowledge of loudspeaker positions. The concept is developed and evaluated in simulations and measurements, which show that the proposed acoustic head tracker outperforms a comparison device.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Guangzheng Yu and Bosun Xie,  South China University of Technology

TITLE:   Multiple sound sources solution for near-field head-related transfer function measurements

Near-field head-related transfer functions (HRTFs) are essential to scientific research on binaural hearing and to practical applications of virtual auditory display. In contrast to far-field HRTF measurement, near-field measurement is more difficult, because high efficiency, accuracy and repeatability are required. When multiple sound sources are adopted to accelerate the near-field HRTF measurement, multiple scattering among the sound sources could be the major factor influencing the accuracy. In the present work, a well-designed multiple sound source solution is used to accelerate near-field HRTF measurement. Results show that the error caused by multiple scattering among the sound sources can be kept within acceptable limits through reasonable design and arrangement of the sound sources.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Yun-Han Wu, Scott Murakami and Agnieszka Roginska,  New York University

TITLE:   Comparison of Measured and Simulated Room Impulse Response for an Interactive Music Making Environment in Mixed Reality

This paper describes a musical environment created for mixed reality (MR) in which virtual sound objects are superimposed on the real environment, with the goal of blending them seamlessly with that environment using acoustic simulation techniques. The resulting experience allows a participant to interact with and create music by playing the virtual objects. An informal subjective study is performed where subjects are asked to evaluate the acoustics in the scene, with the different techniques applied, based on preference and the evaluation of the interaction design with the musical objects.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Marco Binelli, Daniel Pinardi and Angelo Farina,  University of Parma, Italy, Tiziano Nili,  ASK Industries, Italy

TITLE:   Individualized HRTF for playing VR videos with Ambisonics spatial audio on HMDs

Current systems for rendering 360-degree videos with spatial audio on HMDs rely on a binaural approach combined with Ambisonics technology. These head-tracking systems employ generic HRTFs typically measured with a dummy head in an anechoic room. In this paper, we describe a new solution that has been developed to play 360-degree video files with spatial audio for desktop and portable platforms, based on existing open source software. The HRTF set can be loaded from a standard WAV file chosen in an existing database or from an ad-hoc measurement or simulation. The capability to switch among multiple HRTF sets while playing has been added.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Alan Kan,  University of Wisconsin-Madison

TITLE:   On high-frequency interaural time difference sensitivity in complex auditory environments

The Duplex theory of sound localization is a useful principle for guiding trade-offs between realistic production vs implementation complexity for virtual auditory environments. However, there are exceptions to this theory in the psychoacoustic literature that should be noted. One exception is that sensitivity to an interaural time difference (ITD) can be improved when either low-frequency amplitude modulation (AM) or frequency modulation (FM) is introduced into a high frequency tone. This paper presents results from a psychoacoustic experiment that show that having both AM and FM greatly improves sensitivity to high-frequency ITDs compared to AM or FM alone when presented in incoherent broadband noise. The implications of these findings to the generation of virtual auditory environments are discussed.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Nail Gumerov, Dmitry Zotkin, Adam O'Donovan and Ramani Duraiswami,  VisiSonics Corporation

TITLE:   Spatial Acoustic Field Simulation as a Service

Many practical applications in VR and AR require the computation of scattered acoustical fields from complex shaped objects. Such computations arise, e.g., in computing head related transfer functions and in designing microphone arrays. We present a parallelized fast-multipole accelerated boundary-element solver which can be used to efficiently compute the solution to such problems in the cloud. This builds on our extensive previous work on developing such methods. Details of the methods, and results from problems relevant to audio in AR and VR will be presented.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Hanne Stenzel and Philip J.B. Jackson,  University of Surrey, UK

TITLE:   Perceptual thresholds of audio-visual spatial coherence for a variety of audio-visual objects

Audio-visual spatial perception relies on the integration of both auditory and visual spatial information. Depending on auditory and visual features of the stimulus, and the relevance of each sound to the listener, offsets between both signals are more or less acceptable. The current paper investigates to what extent each of these factors influences how critical the perception of spatial coherence is by estimating the psychometric function for seventeen audio-visual stimuli. The results show that the maximum accepted offset angle does not depend on semantic categories but is linked to audio feature classes, with harmonic sounds leading to greater acceptable offsets. A regression shows that the perceptual spectral centroid is negatively correlated with the offset angle and the slope of the psychometric spatial-coherence function. This finding, however, is not conclusive and further research is necessary to define all parameters that influence bimodal localization of realistic stimuli.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Rishabh Gupta, Rishabh Ranjan, and Woon-Seng Gan,  Nanyang Technological University, Singapore, Jianjun He,  Maxim Integrated Product Inc

TITLE:   Investigation of the Effect of VR/AR Headgears on Head-Related Transfer Functions for Natural Listening

With the advent of Virtual/Augmented/Mixed Reality (VR/AR/MR) applications and the accompanying head-mounted displays (HMDs), these devices are becoming increasingly common in daily life, producing immersive virtual and augmented audio-visual content. However, the presence of these HMDs can affect the acoustic propagation of the sound waves from the sound sources to the human ears, resulting in changes in head-related transfer functions (HRTFs). In this paper, we measured HRTFs with and without these headgears and investigated their effect on HRTFs at different source directions using descriptive analysis and objective metrics. Furthermore, subjective listening tests were conducted to study the perceptual significance of these differences in terms of timbre and spatial performance.

5:00 - 7:00   Monday, August 20   Room: Edison   Poster Session

AUTHORS:   Raimundo Gonzalez and Tapio Lokki,  Aalto University, Joshua Pearce,  Michigan Technological University

TITLE:   Modular Design for Spherical Microphone Arrays

Spherical microphone arrays are commonly utilized for recording, analyzing and reproducing sound-fields. In the context of higher-order Ambisonics, the spatial resolution depends on the number and distribution of sensors over the surface of a sphere. Commercially available arrays have set configurations that cannot be changed, which limits their usability for experimental and educational spatial audio applications. Therefore, an open-source modular design using MEMS microphones and 3D printing is proposed for selectively capturing frequency-dependent spatial components of sound-fields. Following a modular paradigm, the presented device is low cost and decomposes the array into smaller units (a matrix, connectors and microphones), which can be easily rearranged to capture up to third-order spherical harmonic signals with various physical configurations.

Paper Abstracts for Tuesday, August 21


11:30 - 11:55   Tuesday, August 21   Room: Michelangelo,   Paper Session: HRTF Personalization

AUTHORS:   David Poirier-Quinot and Brian Katz,  Sorbonne Université, CNRS, Institut Jean Le Rond d'Alembert, France

TITLE:   Impact of HRTF individualization on player performance in a VR shooter game II

We present the extended results of a previous experiment to assess the impact of individualized binaural rendering on player performance in the context of a VR "shooter game". Participants played a game in which they were faced with successive enemy targets approaching from random directions on a sphere. Audio-visual cues allowed for target localization. Participants were equipped with an Oculus CV1-HMD, headphones, and two Oculus Touch hand tracked devices as targeting mechanisms. Participants performed six sessions alternately using their best and worst-match HRTFs from a "perceptually orthogonal" optimized set of 7 HRTFs [Katz2012]. Results suggest that the impact of the HRTF on participant performance (speed and movement efficiency) depends both on participant sensitivity and HRTF presentation order.

12:00 - 12:25   Tuesday, August 21   Room: Michelangelo,   Paper Session: HRTF Personalization

AUTHORS:   Rishi Shukla, Rebecca Stewart, Mark Sandler,  Queen Mary University of London, UK, Agnieszka Roginska,  New York University

TITLE:   User selection of optimal HRTF sets via holistic comparative evaluation

If well-matched to a given listener, head-related transfer functions (HRTFs) that have not been individually measured can still present relatively effective auditory scenes compared to renderings from individualised HRTF sets. We present and assess a system for HRTF selection that relies on holistic judgements of users to identify their optimal match through a series of pairwise adversarial comparisons. The mechanism resulted in clear preference for a single HRTF set in a majority of cases. Where this did not occur, randomised selection between equally judged HRTFs did not significantly impact user performance in a subsequent listening task. This approach is shown to be equally effective for both novice and expert listeners in selecting their preferred HRTF set.

12:30 - 12:55   Tuesday, August 21   Room: Michelangelo,   Paper Session: HRTF Personalization

AUTHORS:   Ramani Duraiswami, Nail Gumerov, Adam O'Donovan and Dmitry Zotkin,  VisiSonics Corporation, Justin Shen, Matthias Zwicker,  University of Maryland

TITLE:   Large Scale HRTF personalization

Many applications in creating realistic audio for augmented and virtual reality require individual head-related transfer functions. Individual HRTFs are typically measured via a slow and relatively tedious procedure, making them unusable in practical applications. We discuss two approaches that allow the creation of HRTFs at large scale for applications. The first is a reciprocal measurement approach which allows fast measurement. We have applied the technique to measure the HRTFs of several individuals. The second approach is based upon computation of the HRTF from meshes in the cloud in a few minutes. We have shown this approach works with 3D scans, and are now extending it to photograph-based processing.

2:00 - 2:25   Tuesday, August 21   Room: Michelangelo,   Paper Session: Ambisonics

AUTHORS:   Fernando Lopez-Lezcano,  CCRMA/Stanford University

TITLE:   The *SpHEAR project update: the TinySpHEAR and Octathingy soundfield microphones

This paper is an update of the *SpHEAR (Spherical Harmonics Ear) project, created with the goal of using low cost 3D printers to fabricate Ambisonics microphones. The initial four-capsule prototypes reported in 2016 have evolved into a family of full-featured high quality microphones that include the traditional tetrahedral design and a more advanced eight capsule microphone that can capture second-order soundfields. The project includes all mechanical 3D models and electrical designs, as well as all the procedures and software needed to calibrate the microphones for best performance. A fully-automated robotic arm measurement rig is also described. Everything in the project is shared through GPL/CC licenses, uses Free Software components, and is available on a public Git repository (

2:30 - 2:55   Tuesday, August 21   Room: Michelangelo,   Paper Session: Ambisonics

AUTHORS:   Michael Goodwin,  Xperi, Inc / DTS

TITLE:   A hybrid beamforming framework for B-format encoding with arbitrary microphone arrays

In this paper we consider the problem of B-format encoding of live audio scenes captured using practical compact microphone arrays with unconstrained microphone locations. We formulate the problem and provide a mathematical framework for a hybrid adaptive beamformer which combines active beamforming based on frequency-domain spatial analysis and synthesis for estimated directional audio sources and least-squares passive beamforming for residual audio scene components. The necessary calibration measurements are described and algorithmic details for efficient implementation are given.

3:00 - 3:25   Tuesday, August 21   Room: Michelangelo,   Paper Session: Ambisonics

AUTHORS:   Calum Armstrong, Damian Murphy and Gavin Kearney,  University of York, UK

TITLE:   A Bi-RADIAL Approach to Ambisonics

This paper introduces Binaural Rendering of Audio by Duplex Independent Auralised Listening (Bi-RADIAL), a new scheme for the reproduction of 3D sound over headphones. Principles, considerations and methods of Bi-RADIAL decoding and their application within the binaural rendering of Virtual Ambisonics are discussed. Three methods of delivering Bi-RADIAL Ambisonics are compared and the advantages of exploiting a Bi-RADIAL scheme over traditional binaural-based Ambisonic playback are highlighted. Simulation results for standard and Bi-RADIAL Ambisonic decoders are given using a perceptually based comparison of their frequency spectra. Analysis is made for 1st, 3rd and 5th order decoders considering both basic and maxrE weighting schemes. Results show a clear improvement in spectral reconstruction when using a Bi-RADIAL decoder.
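As background on the "basic" and maxrE weighting schemes mentioned in the abstract above, the following is a minimal sketch of how per-order gains are commonly computed for 3D Ambisonics. It assumes the widely used closed-form approximation r_E = cos(137.9° / (N + 1.51)) with gains P_n(r_E), where P_n is the Legendre polynomial; the function names are illustrative and nothing here is taken from the paper's own decoders.

```python
import math

def legendre(n, x):
    """Legendre polynomial P_n(x) via the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        # (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def basic_weights(order):
    """Basic weighting: every Ambisonic order keeps unit gain."""
    return [1.0] * (order + 1)

def max_re_weights(order):
    """Approximate max-rE gains for orders 0..order (3D case, assumed formula)."""
    r_e = math.cos(math.radians(137.9 / (order + 1.51)))
    return [legendre(n, r_e) for n in range(order + 1)]

if __name__ == "__main__":
    # The orders analysed in the abstract: 1st, 3rd and 5th.
    for order in (1, 3, 5):
        print(order, [round(g, 3) for g in max_re_weights(order)])
```

Each gain applies to all spherical-harmonic channels of the same order, so a fifth-order decoder needs only six distinct values.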

4:00 - 4:25   Tuesday, August 21   Room: Michelangelo,   Paper Session: Binaural Rendering of 3D Sound Fields

AUTHORS:   César Salvador, Shuichi Sakamoto, Jorge Trevino and Yôiti Suzuki,  Tohoku University, Japan

TITLE:   Enhancing binaural reconstruction from rigid circular microphone array recordings by using virtual microphones

Spatially accurate binaural reconstruction from rigid circular arrays requires a large number of microphones. However, physically adding microphones to available arrays is not always feasible. In environments such as conference rooms or concert halls, prior knowledge regarding source positions allows for the prediction of pressure signals at positions without microphones. Prediction is performed by relying on a physical model for the acoustically rigid sphere. Recently, we used this model to formulate a surface pressure interpolation method for virtual microphone generation. In this study, we use virtual microphones to enhance the high-frequency spatial accuracy of binaural reconstruction. Numerical experiments in anechoic and reverberant conditions demonstrate that adding virtual microphones extends the frequency range of operation and attenuates the time-domain artifacts.

4:30 - 4:55   Tuesday, August 21   Room: Michelangelo,   Paper Session: Binaural Rendering of 3D Sound Fields

AUTHORS:   Axel Plinge,  Fraunhofer IIS, Germany, Sebastian Schlecht, Oliver Thiergart, Thomas Robotham, Olli Rummukainen and Emanuel Habets,  International Audio Laboratories, Erlangen, Germany

TITLE:   Six-Degrees-of-Freedom Binaural Audio Reproduction of First-Order Ambisonics with Distance Information

First-order Ambisonics (FOA) recordings can be processed and reproduced over headphones. They can be rotated to account for the listener's head orientation. However, virtual reality (VR) systems allow the listener to move in six-degrees-of-freedom (6DoF), i.e., three rotational plus three translational degrees of freedom. Here, the apparent angles and distances of the sound sources depend on the listener's position. We propose a technique to facilitate 6DoF. In particular, a FOA recording is described using a parametric model, which is modified based on the listener's position and information about the distances to the sources. We evaluate our method by a listening test, comparing different binaural renderings of a synthetic sound scene in which the listener can move freely.
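To illustrate why apparent angles and distances depend on the listener's position, here is a small geometric sketch. It is not the parametric method proposed in the paper: it only recomputes the direction and distance of a single source after the listener translates, and applies a simple 1/r gain as an assumed distance law. The coordinate convention (x front, y left, z up, angles in degrees) and all names are assumptions made for this example.

```python
import math

def relocate_source(azimuth_deg, elevation_deg, distance, listener_offset):
    """Direction and distance of a source as heard from a translated listener.

    azimuth_deg, elevation_deg, distance describe the source relative to the
    recording position; listener_offset is the listener's (x, y, z) shift in metres.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    # Source position in Cartesian coordinates relative to the recording point.
    sx = distance * math.cos(el) * math.cos(az)
    sy = distance * math.cos(el) * math.sin(az)
    sz = distance * math.sin(el)
    # Vector from the new listener position to the source.
    dx = sx - listener_offset[0]
    dy = sy - listener_offset[1]
    dz = sz - listener_offset[2]
    new_distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if new_distance == 0.0:
        raise ValueError("listener coincides with the source")
    new_azimuth = math.degrees(math.atan2(dy, dx))
    new_elevation = math.degrees(math.asin(dz / new_distance))
    return new_azimuth, new_elevation, new_distance

def distance_gain(old_distance, new_distance):
    """Linear gain change under an assumed 1/r distance law."""
    return old_distance / new_distance

if __name__ == "__main__":
    # A source 2 m straight ahead; the listener steps 1 m forward.
    az, el, dist = relocate_source(0.0, 0.0, 2.0, (1.0, 0.0, 0.0))
    print(az, el, dist, distance_gain(2.0, dist))
```

Head rotation would be applied on top of this translation step, for example by rotating the resulting direction before binaural rendering.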

5:00 - 5:25   Tuesday, August 21   Room: Michelangelo,   Paper Session: Binaural Rendering of 3D Sound Fields

AUTHORS:   Thomas McKenzie, Damian Murphy and Gavin Kearney,  University of York, UK

TITLE:   Directional Bias Equalisation of First-Order Binaural Ambisonic Rendering

The human auditory system is more accurate at localising sounds in front than at lateral, rear and elevated directions. In virtual reality applications, where Ambisonic audio is presented to the user binaurally (over headphones) in conjunction with a head-mounted display, it is imperative that audio in the frontal direction is as accurate as possible. This paper presents a method for improving frontal high frequency reproduction of binaural Ambisonic rendering through a novel adaptation of the diffuse-field equalisation technique to exploit the non-uniform directional sensitivity of human hearing. The method is evaluated via spectral difference and a height localisation model, and results show improved frontal reproduction at the expense of lateral fidelity.

Paper Abstracts for Wednesday, August 22


11:30 - 11:55   Wednesday, August 22   Room: Michelangelo,   Paper Session: Virtual Acoustics & Environment Modeling

AUTHORS:   Sara R. Martin and U. Peter Svensson,  Acoustics Research Centre, Dep. of Electronic Systems, Norwegian University of Science and Technology, Trondheim, Norway

TITLE:   Modeling sound sources with non-convex shapes using an edge diffraction approach

This paper explores the modeling of sound radiation from vibrating structures, representing the acoustic environment with Green's functions. A fictive convex hull is created that encloses the vibrating structure, and different subdomains will be created at the structure's indents. Each boundary between those subdomains and the convex exterior is then discretized, employing "virtual pistons". The radiation impedance of those virtual pistons can be computed efficiently for the external convex domain with the edge diffraction model in [J. Acoust. Soc. Am. 133, pp. 3681-3691, 2013]. The impedance contributions of the indenting subdomains can be computed theoretically, by means of modal analysis or by any common numerical method. An open shoebox-shaped object is presented and analyzed as an example.

12:00 - 12:25   Wednesday, August 22   Room: Michelangelo,   Paper Session: Virtual Acoustics & Environment Modeling

AUTHORS:   Anne Heimes, Muhammad Imran and Michael Vorländer,  Institute of Technical Acoustics, RWTH Aachen University, Germany

TITLE:   Real-Time Building Acoustics Auralization of Virtual Environments

In this study we propose a framework for auralization of sound transmission in virtual buildings and develop the building acoustics filters. This paper describes the calculations for airborne sound insulation metrics based on ISO 12354 and covers the auralization process of sound transmission through the flanking paths of building elements. The insulation filters describe the sound transmission between dwellings through partitions and flanking structures to estimate the transfer functions between the sources and receivers. In Unity 3D, a virtual scenario is created that consists of a multi-storey residential flat with different types of building elements. The sound insulation filters are implemented in the virtual scenario to introduce more realism in the auralization of building acoustics in virtual spaces.
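As a rough illustration of how direct and flanking transmission paths combine in ISO 12354-style calculations, the sketch below sums the transmission factors of all paths and converts the total back to an apparent sound reduction index, R' = -10 log10(10^(-R_Dd/10) + sum over flanking paths of 10^(-R_ij/10)). The per-path values are hypothetical single numbers for one frequency band; the paper's actual filters are not reproduced here.

```python
import math

def apparent_sound_reduction_index(direct_db, flanking_db):
    """Combine the direct path and flanking paths into R' (dB).

    direct_db: sound reduction index of the direct path (dB).
    flanking_db: iterable of flanking-path sound reduction indices (dB).
    """
    # Transmission factor of each path is 10^(-R/10); the factors add up.
    total_tau = 10 ** (-direct_db / 10) + sum(10 ** (-r / 10) for r in flanking_db)
    return -10 * math.log10(total_tau)

if __name__ == "__main__":
    # Hypothetical values for one band: a 52 dB partition and four flanking paths.
    r_prime = apparent_sound_reduction_index(52.0, [60.0, 62.0, 58.0, 65.0])
    print(round(r_prime, 1))
```

Because the paths add as energies, the result is always at or below the weakest single path, which is why flanking transmission can dominate even when the separating partition performs well.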

12:30 - 12:55   Wednesday, August 22   Room: Michelangelo,   Paper Session: Virtual Acoustics & Environment Modeling

AUTHORS:   Keith Godin, Ryan Rohrer, John Snyder and Nikunj Raghuvanshi,  Microsoft

TITLE:   Wave Acoustics in a Mixed Reality Shell

We demonstrate the first integration of wave acoustics in a virtual reality operating system. The Windows mixed reality shell hosts third-party applications inside a 3D virtual home, propagating sound from these applications throughout the environment to provide a natural user interface. Rather than applying manually-designed reverberation volumes or ray-traced geometric acoustics, we use wave acoustics which robustly captures cues like diffracted occlusion and reverberation propagating through portals while reducing the design and maintenance burden. We describe our rendering implementation, materials-based design techniques, reverberation tuning, dynamic range management, and temporal smoothing that ensure a natural listening experience across unpredictable audio content and user motion.

2:00 - 2:25   Wednesday, August 22   Room: Michelangelo,   Paper Session: Applications in VR/AR

AUTHORS:   Jonathan Mathews and Jonas Braasch,  Rensselaer Polytechnic Institute

TITLE:   Real-Time Source-Tracking Spherical Microphone Arrays for Immersive Environments

Spherical microphone arrays have attained considerable interest in recent years for their ability to decompose three-dimensional soundfields. This paper details real-time capabilities of a source-tracking system composed of a beamforming array and multiple lavalier microphones. Using the lavalier microphones for source identification, a particle filter can be implemented to allow independent tracking of the orientation of multiple sources simultaneously. This source identification and tracking mechanism is utilized in an immersive lab space. In conjunction with networked audiovisual equipment, the system can generate a real-time virtual representation of sound sources for a more dynamic telematic experience.

2:30 - 2:55   Wednesday, August 22   Room: Michelangelo,   Paper Session: Applications in VR/AR

AUTHORS:   Rishabh Gupta, Rishabh Ranjan and Woon-Seng Gan,  Nanyang Technological University, Singapore, Jianjun He,  Maxim Integrated Products Inc.

TITLE:   On the use of closed back headphones for active hear-through equalization in augmented reality applications

Augmented Reality (AR) audio refers to techniques where virtual sounds are superimposed on real sounds to produce immersive digital content. Headphones are widely used in consumer devices for playback of virtual sounds. However, for AR audio, an important step is to make sure that headphones allow external sounds to pass through naturally using Hear-Through (HT) processing. This paper investigates HT design for real sounds using closed-back circumaural headphones equipped with two pairs of microphones. An adaptive filtering algorithm was used to derive the equalization filter. Experimental results show a close match between the equalized signal and reference open-ear listening. A subjective study was carried out to compare the spatial and timbral sound quality of the HT mode.
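The abstract does not name the adaptive algorithm used. As a generic illustration of how an adaptive filter can derive an equalization response, the sketch below uses normalized LMS (NLMS), chosen here as an assumption, to adapt FIR weights so that a filtered input tracks a reference signal. The signals, the 3-tap target response, and the step size are hypothetical stand-ins, not the authors' setup.

```python
import random

def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Adapt FIR weights w so that the filtered input x tracks the reference d."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps                      # recent input samples, newest first
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]
        y_n = sum(wi * bi for wi, bi in zip(w, buf))   # current filter output
        e_n = d_n - y_n                                 # error against the reference
        norm = sum(b * b for b in buf) + eps            # input power normalization
        w = [wi + mu * e_n * bi / norm for wi, bi in zip(w, buf)]
    return w

# Hypothetical test signal: the reference is the input passed through a known
# 3-tap response, standing in for the open-ear target.
random.seed(0)
target = [0.8, -0.3, 0.1]
x = [random.uniform(-1.0, 1.0) for _ in range(5000)]
d = [sum(target[k] * x[n - k] for k in range(3) if n - k >= 0)
     for n in range(len(x))]
w = nlms(x, d, num_taps=3)
print([round(wi, 3) for wi in w])
```

In this noiseless toy case the weights converge to the target response. A real hear-through system would also have to account for headphone leakage and the playback path, which this sketch ignores.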

3:00 - 3:25   Wednesday, August 22   Room: Michelangelo,   Paper Session: Applications in VR/AR

AUTHORS:   Karlheinz Brandenburg,  Fraunhofer IDMT & Technical University Ilmenau, Germany, Estefanía Cano, Hanna Lukashevich, Thomas Köllmer,  Fraunhofer IDMT, Ilmenau, Germany, Annika Neidhardt, Florian Klein, Ulrike Sloma and Stephan Werner,  Technical University Ilmenau, Germany

TITLE:   Plausible Augmentation of Auditory Scenes Using Dynamic Binaural Synthesis for Personalized Auditory Realities

Personalized Auditory Realities (PARty) were introduced at a recent conference (Karlheinz Brandenburg et al.: “Personalized Auditory Realities”, DAGA 2018, Garching, Germany). In this proposed system, a real auditory environment is analyzed to find separate audio objects. These may be isolated using source separation techniques. Such objects can then be manipulated, virtual sound objects can be added, and the whole scene can be rendered to fit into the real auditory environment. This paper presents the basic idea of PARty and then focuses on the dynamic binaural synthesis system which is used to reproduce ear signals fitting into the real environment. The viability of the system is evaluated for the features localization, externalization, and overall quality in terms of spatial auditory perception.

3:30 - 3:55   Wednesday, August 22   Room: Michelangelo,   Paper Session: Applications in VR/AR

AUTHORS:   Andrea Genovese, Gabriel Zalles, Gregory Reardon and Agnieszka Roginska,  New York University

TITLE:   Acoustic perturbations in HRTFs measured on Mixed Reality Headsets

Materials that obstruct the free-field path of acoustic waves to the human ear may introduce distortions that modify the natural Head-Related Transfer Functions. In this paper, the effect of wearing commercially available Head-Mounted Displays for Mixed and Augmented Reality has been measured via a dummy head mannequin. Such spectral distortions may be relevant for mixed reality environments where real and virtual sounds mix together in the same auditory scene. The analysis revealed that the measured HMDs affected the fine structure of the HRTF (> 3-6 kHz) and also introduced non-negligible distortions in the interaural level difference range, mostly at the contralateral ear. Distortion patterns in HRTFs and cue modifications are reported and discussed.

4:00 - 4:25   Wednesday, August 22   Room: Michelangelo,   Paper Session: Applications in VR/AR

AUTHORS:   Zeynep Özcan,  Istanbul Technical University, Anıl Çamcı,  University of Michigan

TITLE:   An Augmented Reality Music Composition Based on the Sonification of Animal Behavior

In this paper, we discuss the immersive sonification of an artificial ecosystem in the form of an interactive sound art piece, named Proprius. We first offer an overview of existing work that utilizes ecological models in compositional and sonification contexts. We then describe the behavioral and ethological models utilized in Proprius. We evaluate the musical characteristics of animal behaviors, and discuss our approach to sonifying them in the context of an interactive augmented reality composition. We provide details of our system in terms of how it implements ecological simulation, immersive audio, and embodied interaction.

5:00 - 5:25   Wednesday, August 22   Room: Michelangelo,   Paper Session: HRTF Modeling

AUTHORS:   Christopher Buchanan,  Signum Audio, UK, Michael Newton,  Acoustics & Audio Group, University of Edinburgh, UK

TITLE:   Dynamic Balanced Model Truncation of the Spherical Transfer Function For Use in Structural HRTF Models

The Spherical Transfer Function (STF) has previously been used in structural HRTF modelling as an analytical approximation to the human head. Versions based on both spherical and spheroidal solid bodies have been incorporated into a range of systems, such as the well-known Brown & Duda [Brown 1998] structural model. STF-based models provide a way to simulate frequency-dependent head shadowing (ILD) and time delay (ITD) effects, which can form the foundation for structural HRTF representation. We derive and implement a customizable approximation of the STF based on Balanced Model Truncation, and utilize its inherent modular characteristics to synthesize binaural signals from monaural input at relatively low computational cost.

5:30 - 5:55   Wednesday, August 22   Room: Michelangelo,   Paper Session: HRTF Modeling

AUTHORS:   David Romblom and Helene Bahu,  Dysonics, San Francisco

TITLE:   A Revision and Objective Evaluation of the 1-Pole 1-Zero Spherical Head Shadowing Filter

Structural models of Head-Related Transfer Functions attempt to decompose complex acoustic phenomena into constituent signal processing models. The work of Brown and Duda modeled the head with two elements: a pure delay estimated from a ray-tracing formula by Woodworth and a 1-pole 1-zero shadowing filter. The ray-tracing formula is valid for frequencies above ~2 kHz while interaural time delay (ITD) is perceptually significant below ~1.5 kHz. The frequency-dependence of the phase-derived ITDp has been shown by many authors but is not accounted for by ray-tracing assumptions. As such, the shadowing filter must account for the low frequency variation in ITDp. Using a numerical approximation of Rayleigh's solution as a reference, this paper evaluates and revises the work of Brown and Duda.
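For orientation, the sketch below evaluates the two elements named above in their commonly cited original forms: the Woodworth ray-tracing ITD and the Brown and Duda 1-pole 1-zero head-shadow filter. It does not reproduce the revision proposed in the paper. The head radius, speed of sound, and function names are assumptions, and the defaults alpha_min = 0.1 and theta_min = 150 degrees are the values usually quoted for the original model.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed
HEAD_RADIUS = 0.0875     # m, a commonly assumed average head radius

def woodworth_itd(azimuth_rad, a=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Woodworth ray-tracing ITD in seconds, for 0 <= azimuth <= pi/2."""
    return (a / c) * (azimuth_rad + math.sin(azimuth_rad))

def head_shadow_response(freq_hz, incidence_deg, a=HEAD_RADIUS,
                         c=SPEED_OF_SOUND, alpha_min=0.1, theta_min_deg=150.0):
    """Complex response of H(s) = (alpha*s + beta) / (s + beta) at s = j*2*pi*f.

    incidence_deg is the angle between the source direction and the ear axis.
    """
    beta = 2.0 * c / a
    # alpha sets the high-frequency gain: 2 facing the ear, alpha_min at theta_min.
    alpha = (1 + alpha_min / 2) + (1 - alpha_min / 2) * math.cos(
        math.radians(incidence_deg / theta_min_deg * 180.0))
    s = 1j * 2 * math.pi * freq_hz
    return (alpha * s + beta) / (s + beta)

itd_90 = woodworth_itd(math.pi / 2)
print(f"ITD at 90 degrees: {itd_90 * 1e6:.0f} microseconds")
gain_db = 20 * math.log10(abs(head_shadow_response(8000.0, 150.0)))
print(f"Shadowed-ear gain at 8 kHz: {gain_db:.1f} dB")
```

Note that the Woodworth delay is independent of frequency, which is the limitation the abstract points to: any low-frequency variation in ITD has to come from the phase of the shadowing filter.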

6:00 - 6:25   Wednesday, August 22   Room: Michelangelo,   Paper Session: HRTF Modeling

AUTHORS:   Fabian Brinkmann and Stefan Weinzierl,  Technical University of Berlin, Germany

TITLE:   Comparison of head-related transfer functions pre-processing techniques for spherical harmonics decomposition

Head-related transfer functions (HRTFs) are the basis for virtual auditory reality, and a large number of HRTFs is required for a perceptually transparent representation. Consequently, HRTFs are commonly reconstructed from a reduced data set by means of interpolation. The spherical harmonics transform (SHT), which decomposes HRTF sets into weighted orthogonal basis functions, is a particularly promising approach for this, because it yields a spatially continuous HRTF representation. Because the SHT has to be order-truncated in practice, it is of interest to find HRTF representations that concentrate the HRTF energy at low SH orders. This study compares previous approaches to a decomposition based on time-aligned HRIRs for their suitability to reduce the required SH order.

6:30 - 6:55   Wednesday, August 22   Room: Michelangelo,   Paper Session: HRTF Modeling

AUTHORS:   Faiyadh Shahid, Nikhil Javeri, Kapil Jain and Shruti Badhwar,  EmbodyVR

TITLE:   AI DevOps for large-scale HRTF prediction and evaluation: an end to end pipeline

Bringing truly immersive 3D audio experiences to the end user requires a fast and user-friendly method of predicting HRTFs. While machine learning based approaches for HRTF prediction hold potential, it can be challenging to determine the best workflow for deployment given the iterative nature of data preprocessing, feature extraction, prediction and performance evaluation. Here, we describe an automated, end-to-end pipeline for HRTF prediction and evaluation that simultaneously tracks the data, code and model, allowing for a comparison of existing and new techniques against a single benchmark.