The basic concept is centred on the fact that environmental sound follows us around wherever we go, whether we notice it or not. The piece replicates this effect through a combination of motion tracking and multi-channel surround sound. Infrared lighting, together with an IR-sensitive camera, is used to track the positions of people within a space. When a person enters the space a sound is assigned to them, and that sound then follows them around the space as they move through it.
Hardware Overview
The performance space is arranged with an eight-channel audio system surrounding a circular area approximately 12 feet in diameter.
The performance space is evenly lit by infrared light from above: a series of tungsten halogen lighting units, each individually modified with a filter that attenuates the visible portion of its output, is arranged over the space.
Tungsten halogen lighting emits a significant amount of IR light; by filtering the visible spectral output of each unit, the space can therefore be lit efficiently with light in the infrared spectrum.
Part of this IR light is reflected by the people standing within the performance space and is detected by a high frame rate video camera modified to be sensitive specifically to wavelengths above the visible spectrum.
Computer-aided frame-by-frame analysis of the video is used to plot the instantaneous positions of the targets within the space. The video data goes through several specific processes before these positions can be identified:
A still image of the background is taken with no targets in the space. This background image is then computationally subtracted, frame by frame, from the incoming video data, leaving only the changing target positions.
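The subtraction step can be sketched as follows. This is a pure-Python illustration operating on toy 8-bit grayscale frames (the installation performs the equivalent operation on live video); the function name and frame sizes are illustrative, not taken from the actual system.

```python
def subtract_background(frame, background):
    """Per-pixel absolute difference between an incoming frame and the
    stored empty-room background; bright regions mark new targets."""
    return [
        [abs(p - b) for p, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]

# Toy 3x4 frames: a target brightens two pixels of an otherwise dim room.
background = [[10, 10, 10, 10],
              [10, 10, 10, 10],
              [10, 10, 10, 10]]
frame      = [[10, 10, 10, 10],
              [10, 200, 210, 10],
              [10, 10, 10, 10]]

diff = subtract_background(frame, background)
print(diff[1])  # [0, 190, 200, 0] -- only the target survives
```

Static scenery cancels to zero, so anything that remains in the difference image is, by construction, something that entered the space after the background still was captured.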
The resulting image is converted to grayscale format; this allows us to discard unneeded colour information and to apply a threshold-based analysis. The threshold analysis makes it possible to discard unwanted targets that appear less brightly lit.
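The grayscale conversion and thresholding can be sketched as below. The luma weights are the standard Rec. 601 coefficients and the threshold level of 128 is an arbitrary illustrative choice, not a value from the installation.

```python
def to_grayscale(rgb_frame):
    # Rec. 601 luma weights; colour carries no tracking information here.
    return [[int(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_frame]

def apply_threshold(gray_frame, level):
    # Keep only pixels brighter than `level`; dimmer regions (stray
    # reflections, people outside the lit area) are discarded as 0.
    return [[255 if p > level else 0 for p in row] for row in gray_frame]

frame = [[(200, 200, 200), (30, 30, 30)],
         [(40, 40, 40), (220, 220, 220)]]
mask = apply_threshold(to_grayscale(frame), 128)
print(mask)  # [[255, 0], [0, 255]]
```

The output is a binary mask: brightly IR-lit targets become white, everything else black, which is the form the subsequent noise-reduction stage expects.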
Data that passes above the threshold is kept and represents the global target positions; further processing applies a simple noise-reduction pass and amplifies the resulting “white areas”, which represent the individually tracked targets.
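A minimal sketch of the noise-reduction and target-extraction steps, again in pure Python on a toy binary mask. Dropping isolated white pixels is used here as a stand-in for whatever noise-reduction filter the installation actually applies, and the centroid of each white area is taken as the target position.

```python
def denoise(mask):
    """Drop isolated white pixels (no white 4-neighbour) -- a minimal
    stand-in for the installation's noise-reduction pass."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] and any(
                    0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))):
                out[y][x] = 255
    return out

def centroid(mask):
    """Mean position of the white pixels: one (x, y) target position."""
    pts = [(x, y) for y, row in enumerate(mask)
                  for x, p in enumerate(row) if p]
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

mask = [[0, 0, 0, 255],      # lone white pixel: sensor noise
        [0, 255, 255, 0],    # 2x2 white block: a real target
        [0, 255, 255, 0]]
clean = denoise(mask)
print(centroid(clean))  # (1.5, 1.5)
```

A real multi-target system would first label connected components so each person gets their own centroid; that bookkeeping is omitted here for brevity.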
The instantaneous target positions are transmitted, via the TUIO 2Dcur protocol, to Max/MSP for further processing.
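For illustration, a single TUIO 2Dcur "set" message (the message that carries one cursor's normalised position) can be encoded by hand as a raw OSC message. This shows only the "set" message; a full TUIO bundle also carries "alive" and "fseq" messages, and the UDP transport is omitted. Port 3333 is the TUIO convention.

```python
import struct

def osc_string(s):
    """OSC string: UTF-8 bytes, NUL-terminated, padded to a 4-byte boundary."""
    b = s.encode() + b"\x00"
    return b + b"\x00" * (-len(b) % 4)

def tuio_2dcur_set(session_id, x, y, vx=0.0, vy=0.0, accel=0.0):
    """Build a raw /tuio/2Dcur 'set' OSC message for one tracked cursor.
    x and y are normalised [0..1] positions, per the TUIO specification."""
    msg = osc_string("/tuio/2Dcur")
    msg += osc_string(",sifffff")           # string, int32, five float32s
    msg += osc_string("set")
    msg += struct.pack(">i", session_id)    # OSC is big-endian throughout
    msg += struct.pack(">fffff", x, y, vx, vy, accel)
    return msg

packet = tuio_2dcur_set(1, 0.25, 0.75)
# In the installation, packets like this would be sent over UDP
# (conventionally port 3333) for Max/MSP to decode.
print(len(packet), packet[:12])
```

In practice a TUIO tracker library or an OSC library would build these packets; the manual encoding above is only meant to make the wire format concrete.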
Software Design
Design Goals
The software required to allow for motion-tracking-based spatialisation in this particular installation has been custom designed for the current application with reference to several desirable parameters, as follows. Firstly, the playback system is specified as an eight-channel surround sound system. This type of system was chosen over the more traditional 5.1 Dolby Surround setup because the additional speakers improve fidelity: speakers work more efficiently when not taxed by overly complex signals [1], [2].
Particular focus is placed on aesthetic issues by hiding the amplitude control variables required for spatialisation from the user, presenting instead a space through which the user can move, with that motion relating directly to the spatialisation variables. This was deemed an important aspect of the design parameters because exposing the software architecture to the user may result in additional user effort and error conditions [3]. The process is realised by linking the amplitude controls of all eight audio outputs through algorithms that are driven exclusively by the user's motion through the performance space.
Software Outlined
The XY co-ordinates of each individual within the defined space are routed to Max/MSP via the series of computational processes graphically outlined in the above figure. The result is that each individual in the space produces an individual XY co-ordinate pair directly related to their position within the space. This positional data is further processed to allow the relative control of eight audio output channels, each with a dedicated loudspeaker, arranged as shown in the below figure.
The software processing algorithms required to relate the axial motion data of each individual to the amplitude output of each speaker are shown diagrammatically in Figure 3. In order to maintain clarity, the process required for a single axis, X, is shown, followed by a brief explanation.
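As a rough sketch of how a position can drive eight amplitude controls, the following maps a normalised (x, y) pair to eight gains by equal-power panning between the two speakers adjacent to the source direction. This is an assumption-laden illustration: the installation's Max/MSP patch realises its own axis-by-axis algorithm (Figure 3), and the speaker numbering, centre point and crossfade law here are all choices made for the example.

```python
import math

NUM_SPEAKERS = 8  # equally spaced around the circular performance area

def channel_gains(x, y):
    """Map a normalised (x, y) position (centre at 0.5, 0.5) to eight
    amplitude gains via equal-power panning between the two speakers
    adjacent to the source direction. Illustrative only."""
    angle = math.atan2(y - 0.5, x - 0.5) % (2 * math.pi)
    sector = angle / (2 * math.pi) * NUM_SPEAKERS   # fractional speaker index
    lo = int(sector) % NUM_SPEAKERS                 # nearer speaker
    hi = (lo + 1) % NUM_SPEAKERS                    # next speaker round
    frac = sector - int(sector)
    gains = [0.0] * NUM_SPEAKERS
    gains[lo] = math.cos(frac * math.pi / 2)        # equal-power crossfade:
    gains[hi] = math.sin(frac * math.pi / 2)        # lo^2 + hi^2 == 1
    return gains

g = channel_gains(1.0, 0.5)   # directly "east" of centre: speaker 0 alone
print(round(g[0], 3), round(g[1], 3))  # 1.0 0.0
```

The equal-power (sine/cosine) law keeps the summed acoustic power constant as a target moves between speakers, which is why it is preferred over linear crossfades in amplitude-panning systems.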
References
[1] Rolfe, C. (1999) A Practical Guide to Diffusion. [online] Available from: www.naisa.ca/soundtravels/2003/difpract.html [Accessed: 17/04/2010].
[2] White, P. et al (2002) You Are Surrounded: Surround Sound Explained – Part 6. Sound on Sound Magazine, January 2002. UK.
[3] Baudisch, P. et al (2004) Flat Volume Control: Improving Usability by Hiding the Volume Control Hierarchy in the User Interface. Microsoft Research Papers, Volume 6, Number 1. Vienna, Austria.
[4] Howard, D. and Angus, J. (2001) Acoustics and Psychoacoustics, Second Edition. Oxford: Focal Press.
[5] West, J. (1998) IID-based Panning Methods: Rationale for IID-based Panning. University of Miami. [online] Available from: http://mue.music.miami.edu/thesis/jwest/Chap_3/Chap_3_IID_Based_Panning_Methods.html [Accessed: 17/04/2010].