Andreas Walther
As audio reproduction systems evolve to incorporate many loudspeakers, the potential to evoke faithful auditory spatial impressions increases. Psychoacoustic principles of auditory spatial perception, and methods for production and reproduction of acoustic scenes, have been the subject of extensive research. This thesis contributes to both fields. It investigates effects of realistic stimuli in psychoacoustic experiments, and proposes methods for performance assessment and signal enhancement for three-dimensional multi-channel reproduction.
Besides the sound arriving directly from the source, reflected sound arriving shortly after the direct sound influences directional aspects of spatial impressions. Related psychoacoustic phenomena, summarized under the term precedence effect, perceptually fuse the reflected sound with the direct sound. When the precedence effect breaks down, reflections are perceived as echoes. Chapter 3 comprises a series of psychoacoustic experiments that examined the upper limit of the precedence effect by measuring the echo threshold under reflection conditions with (a) varying spectral content, and (b) varying temporal structure. Results indicate that: (a) spectral differences between direct sound and reflected sound have substantial influence on the echo threshold; (b) in many conditions, temporal diffusion in the reflected sound does not affect the echo threshold; for a speech signal, however, a temporally diffuse reflection that is spatially separated from the direct sound is less easily detected as a separate auditory event than a reflection of equal total energy that is identical to the direct sound.
Reflected sound arriving late after the direct sound influences the listener’s feeling of being enveloped by sound. Late reverberation shows diffuse-field characteristics, since uncorrelated sound arrives from many directions. In a diffuse sound field, the two ear signals exhibit a specific frequency-dependent interaural correlation.
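As an illustration of the quantity referred to here, the following is a minimal sketch of a normalized zero-lag correlation coefficient between two ear signals, together with a simple way to generate a noise pair with a chosen expected correlation. It is not the procedure used in the thesis: the function names `interaural_correlation` and `noise_pair`, the Gaussian noise stimuli, and the broadband (not frequency-dependent) analysis are all assumptions made for the example. A frequency-dependent analysis would presumably apply the same measure per frequency band after filtering, which is not shown.

```python
import numpy as np


def interaural_correlation(left, right):
    """Normalized zero-lag correlation coefficient of two signals (hypothetical helper)."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    # Remove the mean so the result is a correlation coefficient in [-1, 1].
    left = left - np.mean(left)
    right = right - np.mean(right)
    denom = np.sqrt(np.sum(left**2) * np.sum(right**2))
    return float(np.sum(left * right) / denom)


def noise_pair(rho, n, rng):
    """Two Gaussian noise signals whose expected correlation is rho (assumed stimulus model)."""
    a = rng.standard_normal(n)
    b = rng.standard_normal(n)
    # Mixing an independent noise into the second channel sets the expected correlation.
    return a, rho * a + np.sqrt(1.0 - rho**2) * b


rng = np.random.default_rng(0)
left, right = noise_pair(0.5, 200_000, rng)
print(interaural_correlation(left, right))   # close to the target of 0.5
print(interaural_correlation(left, -left))   # fully anti-correlated pair
```

Negative targets (for example `rho = -0.3`) work the same way, which matters when deviations towards both the positive and the negative correlation range are of interest.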
Chapter 4 presents a psychoacoustic experiment to obtain just-noticeable differences in interaural correlation from diffuse-field reference correlations. The measured discrimination thresholds show a distinct frequency-dependent asymmetry between deviations from the reference towards the positive and towards the negative correlation range. This indicates the importance of considering both the positive and negative correlation range for analyses based on interaural correlation measurement.
Chapter 5 proposes an extension of cross-correlation-based diffusion assessment. Using the discrimination thresholds obtained in Chapter 4, it aims to estimate the perceived diffuseness of a sound field reproduced by different multi-channel loudspeaker arrangements for varying listener positions and orientations. Visualizations of the diffuse-field reproduction capabilities of different multi-channel setups illustrate the improvement achievable with an increasing number of loudspeakers and three-dimensional loudspeaker arrangements.
In the absence of an accepted universal standard for multi-channel transport and playback (beyond 5.1 surround), various proprietary solutions exist. There is a demand to ensure compatibility between different formats and to enable adaptability of legacy formats. Chapter 6 establishes a signal processing method to identify and estimate diffuse signal components in multi-channel input signals. Using the decomposed signal, a strategy is proposed to convert legacy surround sound content to 3D formats with an increased number of signal channels.
EPFL, 2013