The following are excerpts of an article entitled 'Hearing Musical Streams', by Stephen McAdams of the Stanford University School of Medicine (he was one of my teachers at I.R.C.A.M. in Paris) and Albert Bregman (Dept. of Psychology at McGill University), regarding auditory stream perception of sequential musical events.
The perceptual effects of a sound are dependent upon the musical context in which that sound is embedded: i.e., a given sound's perceived pitch, timbre and loudness are influenced by the sounds that precede it, coincide with it and even follow it in time. Thus, this context influences the way a listener will associate the sound with various melodic, rhythmic, dynamic, harmonic and timbral structures within the musical sequence.
Leon Van Noorden has stated that "in sequences where the tones follow one another in quick succession, effects are observed which indicate that the tones are NOT processed individually by the human perception system. Indeed, we find various types of mutual interactions between successive tones, such as forward and backward masking, loudness interactions and duration interactions." As for simultaneous sonic events, Bregman has suggested that different sounds are extracted from the superimposed acoustic vibrations according to various perceptual and cognitive organizational mechanisms.
What is an auditory stream?
Auditory stream formation theory is concerned with how the auditory system determines whether a sequence of acoustic events results from one, or more than one, "source". A physical "source" may be considered as some sequence of acoustic events emanating from one location. A "stream" is a psychological organization that mentally represents such a sequence and displays a certain internal consistency or continuity that allows the sequence to be interpreted as a whole.
By way of example, two possible perceptual organizations of a repeating six-tone sequence are illustrated in Figure 1. (Note that Time is represented on the horizontal axis, Frequency is represented on the vertical axis, and the thin lines connecting the tones in the figure indicate the stream percepts.)
In the first configuration (Figure 1a), a repeating six-tone sequence composed of interspersed high and low tones (A1, B2, E2, D3, C2 and F2), having a clock frequency of 5 Hz, is heard as ONE perceptual stream.
Interestingly, when the clock frequency is set to 10 Hz, the high tones perceptually segregate from the low tones to form TWO separate three-tone streams (see Figure 1b).
Notice that one can pay attention to either the higher or the lower stream, switching between them at will, but that it is not possible to concentrate on both simultaneously: the human perceptual system assumes things are coming from one source until it acquires enough information to suggest an alternate interpretation.
Due to the competition among stream organizations, tone F may be perceived as belonging to either the HIGHER stream or the LOWER stream, but NOT to both (see Figure 2a).
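To try the two tempi by ear, the sequence can be synthesized. The following is only a minimal sketch: it assumes that "clock frequency" means tones per second, and the six frequencies in SEQUENCE_HZ, the sample rate and the function name render are hypothetical choices made for illustration, not the values used in the article.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed; kept low for brevity)

# Hypothetical frequencies in Hz: low and high tones interleaved.
# These are NOT the tones of the original figure.
SEQUENCE_HZ = [400.0, 1600.0, 450.0, 1800.0, 500.0, 2000.0]


def render(tone_rate_hz, cycles=1):
    """Return mono samples for `cycles` repetitions of the six-tone sequence.

    tone_rate_hz is the number of tones per second (5 or 10 in the text).
    """
    samples_per_tone = round(SAMPLE_RATE / tone_rate_hz)
    out = []
    for _ in range(cycles):
        for freq in SEQUENCE_HZ:
            for n in range(samples_per_tone):
                # Plain sine tone, phase restarted at every tone onset.
                out.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return out


slow = render(5)   # 200 ms per tone
fast = render(10)  # 100 ms per tone
```

The samples can be written to a WAV file (for instance with the standard wave module) and looped; whether one or two streams are heard at each rate will depend on the frequencies chosen.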
If tone F is connected subliminally to the lower stream, then the rhythm of the upper stream changes dramatically. (see Figure 2b)
Frequency and Tempo
Although a sequence of tones might appear to be coherent or integrated at a very low tempo, things change when the tempo of the sequence is increased. The faster the tempo, the greater the degree of breakdown or decomposition of the tones into narrower streams: when the tempo is very high it seems that every given frequency is beating along in its own stream.
Thus, the particular relationship between frequencies in a tonal pattern, and not just the frequency separation between adjacent tones, plays a vital role in the formation of streams. The faster the tones follow one another, the smaller the frequency separation at which they segregate into separate perceptual streams. Conversely, the greater the frequency separation, the slower the tempo at which segregation occurs.
In Figure 5 the upper curve is the temporal coherence boundary. Above this boundary, it is impossible to integrate the two alternating tones into one stream. Below this boundary lies the temporal coherence region, where it is possible to integrate events into a single perceptual stream. The lower boundary is the fission boundary: below this it is impossible to hear more than one stream.
Note that the regions of fission and coherence overlap, creating an ambiguous region where either percept may be heard.
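The three regions described above can be summarized in a few lines of code. This is a sketch under stated assumptions: it assumes the vertical axis of Figure 5 is frequency separation, that both boundaries have already been read off the figure for one fixed tempo, and the function name possible_percepts is hypothetical. No boundary values are built in, since the article excerpt gives none.

```python
def possible_percepts(separation, fission_boundary, coherence_boundary):
    """Return the percepts available at one fixed tempo.

    All three arguments use the same unit (for example semitones).
    The boundary values must be supplied by the caller from measured data.
    """
    if fission_boundary > coherence_boundary:
        raise ValueError("fission boundary must not exceed coherence boundary")
    if separation < fission_boundary:
        # Below the fission boundary: segregation is impossible.
        return ("one stream",)
    if separation > coherence_boundary:
        # Above the temporal coherence boundary: integration is impossible.
        return ("two streams",)
    # Between the boundaries: the ambiguous region, either percept is possible.
    return ("one stream", "two streams")
```

Called with made-up boundaries of 3 and 9 semitones, a separation of 5 semitones falls in the ambiguous region, where the listener's attention decides which percept is heard.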
Frequency trajectories
Another frequency-based effect involves frequency trajectories (see Figure 6). There are three types of frequency transitions between tones. In the top section, the tones are completely connected by a frequency glissando. In the middle section, an interrupted glissando is directed towards the succeeding tone. No frequency glide occurs in the bottom section: the first tone ends on one frequency and the next tone begins on another.
Note: Tones connected by glissandi are much less likely to segregate under given conditions of tempo and frequency separation than those which make abrupt frequency transitions.
Frequency proximity may sometimes compete with trajectory organization: simultaneous ascending and descending scale patterns presented to opposite ears tend to segregate (see Figure 7). They are perceived as upright and inverted 'V' shaped melodic contours rather than as the complete ascending and descending scale patterns.
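The three transition types can be described by a single parameter: the fraction of the transition that is covered by a glide. The sketch below is an illustration only. The function name transition_frequency, the linear glide in Hz and the treatment of the interrupted portion as silence (returned as None) are all assumptions, not details taken from the article.

```python
def transition_frequency(t, duration, f_from, f_to, glide_fraction):
    """Frequency at time t within a transition lasting `duration`.

    glide_fraction = 1.0      -> complete glissando (top section)
    0 < glide_fraction < 1.0  -> interrupted glide aimed at the next tone
    glide_fraction = 0.0      -> abrupt transition, no glide at all
    Returns None where the glide has been interrupted (assumed silent).
    """
    if not 0.0 <= t <= duration:
        raise ValueError("t must lie within the transition")
    if not 0.0 <= glide_fraction <= 1.0:
        raise ValueError("glide_fraction must be between 0 and 1")
    if t <= glide_fraction * duration:
        # Still gliding: move linearly from f_from toward f_to.
        return f_from + (f_to - f_from) * (t / duration)
    if t == duration:
        # The next tone begins on its own frequency.
        return f_to
    return None  # glide interrupted, next tone not yet started
```

For example, halfway through a complete glissando from 400 Hz to 800 Hz the function returns 600 Hz, whereas with glide_fraction set to 0.0 it returns None at the same instant.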
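A rough way to see where the 'V' contours come from is to regroup the two scales by pitch height alone. In the sketch below the scale degrees 0 to 7 are illustrative, and taking the higher and the lower tone at each time step is a deliberately crude stand-in for grouping by frequency proximity, not a model proposed in the article.

```python
# One ascending and one descending scale, as scale degrees (illustrative).
ascending = [0, 1, 2, 3, 4, 5, 6, 7]
descending = list(reversed(ascending))

# Group by pitch height instead of by ear: at every time step the higher
# tone goes to the upper stream and the lower tone to the lower stream.
upper = [max(a, d) for a, d in zip(ascending, descending)]
lower = [min(a, d) for a, d in zip(ascending, descending)]

print(upper)  # [7, 6, 5, 4, 4, 5, 6, 7]  falls then rises: a 'V'
print(lower)  # [0, 1, 2, 3, 3, 2, 1, 0]  rises then falls: an inverted 'V'
```

Neither regrouped stream contains a complete scale, which matches the percept described above.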
Timbre separation
When the timbre of a sequence of tones is uniform and coherent, one hears only one stream (see lower section of Figure 8). If the lower part of the sequence is made of tones produced by a sine wave (no harmonics) and the upper part is made of tones containing harmonics (in this example, a square waveform), the sequence will be incoherent and split into two streams (see upper section of Figure 8).
In general, the greater the timbral differences within a given sequence, the more streams it will tend to split into.
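The timbral difference between the two waveforms can be made concrete by listing their partials. The sketch below uses the standard Fourier series of an ideal square wave (odd harmonics only, with amplitude 4/(pi*n)); the function names are hypothetical, and the article does not say how its square tones were generated.

```python
import math


def sine_partials(fundamental_hz):
    """A sine tone has a single partial: the fundamental itself."""
    return [(fundamental_hz, 1.0)]


def square_wave_partials(fundamental_hz, max_harmonic):
    """(frequency, relative amplitude) pairs of an ideal square wave.

    Only odd harmonics are present, each with amplitude 4 / (pi * n).
    """
    return [
        (n * fundamental_hz, 4 / (math.pi * n))
        for n in range(1, max_harmonic + 1, 2)
    ]


for freq, amp in square_wave_partials(100.0, 7):
    print(f"{freq:7.1f} Hz  amplitude {amp:.3f}")
```

Two tones with the same fundamental therefore differ in spectral content, and it is this difference, rather than pitch, that separates the streams in this example.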
Loudness separation
When the loudness of the tones in a sequence is uniform and coherent, one hears only one stream. If one or two tones in the sequence are louder than the rest, the sequence will be divided into two streams.
Masking effects
The edge of a louder sound and a contiguous softer sound might be used to induce the perception that the less prominent sound is partially occluded, or masked, by the louder sound. In such a case, the illusion of continuity of the softer sound "behind" the louder sound may result. This has been found to be true for sine tones masking sine tones, noise masking sine tones, and noise masking noise.
Copyright 1979 by Stephen McAdams and Albert Bregman.