PS1061: Sensation and Perception 2013
Term 2, Thursday 11 am - 1 pm (Boiler House)

Lecture 6: Auditory perception: Hearing noise and sound

Course co-ordinator: Johannes M. Zanker, j.zanker@rhul.ac.uk, (Room W 246)


Lecture Topics


questions about hearing (auditory perception)

what is the significance of auditory perception for a person in real life?

what are some of the most interesting problems in auditory perception ?

the nature of sound

a sound source emits spherical pressure waves (shells of air compression)
sound waves are similar to the ripples radiating across a water surface when a pebble is tossed into a still pond
a pure tone is represented by a sinewave (air pressure as function of space/time) which is travelling through space,
with amplitude and frequency (1/period) corresponding to perceived loudness and pitch

amplitude = distance between peak and trough of the sinewave (overall size of pressure changes)
frequency = number of pressure changes per unit of time (inverse of sinewave period, measured in Hz = cycles per second)
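
the two definitions above can be tried out in a minimal sketch that samples a pure tone. The function name, the 8000 Hz sample rate and the 440 Hz example are assumptions made for illustration; note that the `amplitude` parameter here is the peak deviation from zero, so the peak-to-trough distance described above is twice that value.

```python
import math

def pure_tone(frequency_hz, amplitude, duration_s, sample_rate=8000):
    """Sample a sine wave: pressure deviation as a function of time."""
    n_samples = int(round(duration_s * sample_rate))
    return [amplitude * math.sin(2 * math.pi * frequency_hz * n / sample_rate)
            for n in range(n_samples)]

# Hypothetical example: 10 ms of a 440 Hz tone
tone = pure_tone(frequency_hz=440.0, amplitude=1.0, duration_s=0.01)
period_s = 1 / 440.0                     # period is the inverse of frequency
peak_to_trough = max(tone) - min(tone)   # close to 2 * amplitude
```

raising `amplitude` makes the tone louder, raising `frequency_hz` makes its pitch higher.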

making music: the scale


notes of a musical score refer to the keys on the piano -> the frequency generated -> the pitch of a musical tone



       C         D         E          F          G         A          B        C

 

 

keys are arranged on the keyboard in the order of rising frequency of the musical tone generated (e.g. C-major diatonic scale)

 

 

 


harmonic intervals are determined by characteristic frequency ratios
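
a small sketch of how frequency ratios define the notes of the C-major scale. The ratios used are those of just intonation, which is an assumption: a real piano is normally tuned in equal temperament, which only approximates them. The tonic of 264 Hz is a hypothetical value chosen so that A comes out at 440 Hz.

```python
from fractions import Fraction

# Just-intonation ratios relative to the tonic C (assumed tuning system)
RATIOS = {
    "C": Fraction(1), "D": Fraction(9, 8), "E": Fraction(5, 4),
    "F": Fraction(4, 3), "G": Fraction(3, 2), "A": Fraction(5, 3),
    "B": Fraction(15, 8), "C'": Fraction(2),
}
TONIC_HZ = 264  # hypothetical tonic, gives A = 440 Hz

def scale_frequencies(tonic_hz=TONIC_HZ):
    """Frequency in Hz of each note of the scale."""
    return {note: float(tonic_hz * ratio) for note, ratio in RATIOS.items()}

def interval_ratio(low, high):
    """Frequency ratio between two notes of the scale."""
    return RATIOS[high] / RATIOS[low]
```

for example, `interval_ratio("C", "G")` gives the ratio 3/2 (a fifth) and `interval_ratio("C", "C'")` gives 2 (an octave).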



combining frequencies : pitch

frequency & amplitude of pure tones are the basis of perceived pitch and loudness

what happens to the waveform when you superimpose pure tones ?

combined waveforms: perceived pitch usually is that of the fundamental


musical tones are combinations of pure tones (cf. Barlow & Mollon 1982): fundamental (determines pitch) + harmonic frequencies (determine timbre)
more complex sounds can be generated by adding further frequencies: chords, consonance, dissonance, vowels, symphonies, etc.
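
the superposition of a fundamental and its harmonics can be sketched as a sum of sinewaves. The function name, the sample rate and the example amplitudes are assumptions for illustration only. The check at the end shows that the combined waveform repeats with the period of the fundamental, which fits the observation that the perceived pitch is usually that of the fundamental.

```python
import math

def complex_tone(fundamental_hz, harmonic_amplitudes, duration_s, sample_rate=8000):
    """Sum of sinewaves; harmonic_amplitudes[k] scales the component at (k+1) * f0."""
    n_samples = int(round(duration_s * sample_rate))
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(sum(a * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
                           for k, a in enumerate(harmonic_amplitudes)))
    return samples

# Hypothetical example: 200 Hz fundamental plus two weaker harmonics (400, 600 Hz)
wave = complex_tone(200.0, [1.0, 0.5, 0.25], duration_s=0.02)
period_samples = 8000 // 200   # one period of the fundamental = 40 samples
repeats = all(abs(wave[n] - wave[n + period_samples]) < 1e-9
              for n in range(len(wave) - period_samples))
```

changing the relative amplitudes of the harmonics changes the shape of the waveform (timbre) but not its period.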

see also: combination tones


the nature of noise

what happens when you superimpose random tones (i.e. produce several waves at the same time) ?


(like tossing a handful of pebbles into the pond, or raindrops falling on the water surface, see Pierce 1983)

-- this relationship between waves is called interference --

superimposed waveforms generating random patterns make it difficult to separate the individual contributions

white noise is the superposition of many tones with random amplitude and frequency (interfering ripples on the pond) - think of the surface structure of the sea
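
a minimal sketch of this idea: many tones with random amplitude and frequency are added up. The random starting phase, the frequency range, the sample rate and the seed are assumptions added for illustration; with only a finite number of tones the result merely approximates white noise.

```python
import math
import random

def random_tone_mixture(n_tones, duration_s, sample_rate=8000,
                        f_range=(20.0, 3900.0), seed=0):
    """Superimpose n_tones sinewaves with random amplitude, frequency and phase."""
    rng = random.Random(seed)                      # seeded for repeatable output
    tones = [(rng.uniform(0.0, 1.0),               # amplitude
              rng.uniform(*f_range),               # frequency (kept below sample_rate / 2)
              rng.uniform(0.0, 2 * math.pi))       # starting phase (assumption)
             for _ in range(n_tones)]
    n_samples = int(round(duration_s * sample_rate))
    return [sum(a * math.sin(2 * math.pi * f * n / sample_rate + ph)
                for a, f, ph in tones)
            for n in range(n_samples)]

noise = random_tone_mixture(n_tones=200, duration_s=0.05)
```

from the summed samples alone it is hard to tell which individual tones went in, which is the separation problem described above.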


acoustic sensory organ : the ear

the ear works as a transducer, converting sound waves into neural signals


the ossicles in the middle ear (hammer = malleus, anvil = incus, stirrup = stapes) convert oscillations from gas to liquid medium ('impedance' change)

the ear is a sensory organ with extreme sensitivity : absolute threshold corresponds to sound levels that generate eardrum vibrations smaller than 10^-10 m (0.1 nanometer)

encoding of acoustic signals requires several 'engineering tasks', which are accomplished by astonishing biological solutions:


the sensory surface: inner ear

mechanical stimulation is transmitted through the ossicles onto the oval window - here the oscillations are converted to pressure waves in the cochlea - this generates a travelling wave on the basilar membrane (which resonates like the string of a guitar)

 
the organ of Corti in the cochlea picks up the vibrations from the basilar membrane by means of hair cells: mechanosensor array



from peripheral filtering to tonotopic maps

an animation of middle and inner ear motion,
<<source unknown>>

to see (and hear) the inner ear in action, go to http://www.hhmi.org/biointeractive/neuroscience/cochlea.html

encoding with travelling waves :


the travelling wave reaches its maximum on the basilar membrane at a distance from the oval window that is characteristic for a given frequency
(from top to bottom: 100 Hz, 1,000 Hz, 10,000 Hz)

a given frequency stimulates a particular group of hair cells at a given location of the basilar membrane
tone frequency is converted to location !!
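
the conversion of frequency to location can be illustrated with a rough sketch. It uses Greenwood's (1990) frequency-position function for the human cochlea, which is not part of this lecture; the constants and the membrane length of 35 mm are assumptions taken from that fit, so the numbers should be read as approximate.

```python
import math

# Constants of Greenwood's (1990) human fit (assumption, not from the lecture)
A, ALPHA, K = 165.4, 2.1, 0.88
MEMBRANE_LENGTH_MM = 35.0   # approximate length of the basilar membrane (assumed)

def distance_from_oval_window_mm(frequency_hz):
    """Approximate place of maximal travelling-wave amplitude for a pure tone."""
    # relative position measured from the apex: 0 = apex, 1 = base (oval window end)
    x_from_apex = math.log10(frequency_hz / A + K) / ALPHA
    return (1.0 - x_from_apex) * MEMBRANE_LENGTH_MM

for f in (100, 1000, 10000):
    print(f, "Hz ->", round(distance_from_oval_window_mm(f), 1), "mm")
```

in this sketch high frequencies peak close to the oval window and low frequencies further along the membrane.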

mechanical excitation > travelling wave > electrical activity

further frequency tuning is achieved by lateral inhibition in the auditory pathway (cochlear nerve)

after G. v. Békésy, 1967


 fundamental mechanisms:


frequency channels revealed by masking

masking: you can only hear the piccolo if the bassoon is played very softly

   

based on this observation, we can design a psychophysical masking experiment :


close mask

distant mask

  • detect a (target) tone in presence of another (masking) tone 
  • vary intensity of masking tone until target disappears/reappears: determine detection threshold
  • vary frequency of masking tone to determine how thresholds change with frequency difference 

systematic variation of frequency of masking tone to determine a set of detection thresholds for a given target frequency (cf. Barlow & Mollon 1982) is a key method to determine the auditory 'tuning curve' for this target frequency: threshold amplitude as function of mask frequency
the 'bandwidth' of the frequency filter that is detecting the target tone is given by the frequency difference at half-height of the threshold function


tuning curve
  • low thresholds (mask amplitudes) for masks close to the target frequency
  • high thresholds (mask amplitudes) for masks more distant from target frequency
  • width of the U-shaped threshold curve corresponds to the bandwidth of the frequency filter responsible for the target 
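
reading the bandwidth off a measured tuning curve can be sketched as follows. The sketch assumes that 'half-height' means the level halfway between the lowest and highest threshold, and it interpolates linearly between the measured points; the function name and the data in the example are hypothetical.

```python
def half_height_bandwidth(mask_freqs_hz, thresholds_db):
    """Width of a U-shaped threshold curve at half-height (assumed definition)."""
    lowest, highest = min(thresholds_db), max(thresholds_db)
    half = (lowest + highest) / 2
    i_min = thresholds_db.index(lowest)

    def crossing(index_pairs):
        # walk outward from the minimum until the curve reaches the half level
        for i, j in index_pairs:
            y0, y1 = thresholds_db[i], thresholds_db[j]
            if y0 <= half <= y1:
                x0, x1 = mask_freqs_hz[i], mask_freqs_hz[j]
                if y1 == y0:
                    return x0
                return x0 + (half - y0) * (x1 - x0) / (y1 - y0)
        raise ValueError("curve does not reach half-height on this side")

    left = crossing([(i, i - 1) for i in range(i_min, 0, -1)])
    right = crossing([(i, i + 1) for i in range(i_min, len(thresholds_db) - 1)])
    return right - left

# Hypothetical thresholds for a 1000 Hz target tone
freqs = [600, 800, 900, 1000, 1100, 1200, 1400]
thresholds = [80, 60, 40, 20, 40, 60, 80]
bandwidth_hz = half_height_bandwidth(freqs, thresholds)
```

a narrow curve (small bandwidth) means the filter responds only to masks very close to the target frequency.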

this filter tuning is the basis for the perception of pitch !


frequency tuning: basis of pitch perception

frequency tuning can be measured in systematic masking experiments for different target tones …
many filters cover full range of frequencies – like digital audio systems!

tuning curves resulting from frequency filter mechanisms can be measured in systematic masking experiments :

three tuning curves for three different target tones

after Barlow & Mollon 1982


electrophysiological measurements from the auditory pathway (e.g. cochlear nerve of cat, sensitivity as function of frequency) generate very similar patterns of frequency tuning for individual neurones: preferred tones
many filters cover the whole audible frequency range

this filter tuning is the basis for pitch discrimination involved in auditory perception, such as recognising musical tunes!
the same principle of encoding frequency components separately  is used in digital audio systems !!


how to measure loudness ...

the pressure of airwaves determines the magnitude of auditory sensation: 'loudness'

when sound intensity = sound pressure level (SPL) is decreased or increased relative to a reference tone, subjective ratings of loudness change proportionally

(after Gulick et al. 1989)

sound intensity (SPL) is measured in decibels (dB): 20 * log10 (I1/I0), i.e. on a logarithmic scale relative to a reference sound pressure I0 of 20 micropascals
(I0 ~ hearing threshold at 1000 Hz)

perceived loudness can be measured quantitatively by comparing two successively presented tones (of different frequency)
and deciding which one sounded louder (forced choice FC to find threshold for louder/softer)

   

intensity of comparison tone is adjusted until it has the same ‘subjective’ loudness as the reference tone
and then the physical intensity (sound pressure level SPL) is recorded as ‘perceived loudness’


sound intensity: perception of loudness

Audiogram (or audibility function): describes the hearing performance of an individual (Berrien 1946)

by comparing tones of many frequencies we can derive curves of equal loudness

  • audibility function (AF) : detection threshold as function of frequency (given in Hz) 20 ... 20,000 Hz
  • equal loudness contours are determined by matching the subjective intensity of tone pairs at various base intensities (1000 Hz reference: close to 0 dB at threshold)

speech only covers a small region of the auditory response range: 300 - 5000 Hz, 40 - 70 dB

remember: sound intensity (SPL) is measured in decibels:
    20 * log (I1/I0)
as multiples of hearing threshold I0 at 1000 Hz, in logarithmic units:
an increase in sound pressure by a factor of 10 means an additional 20 dB
(for example: 20db = 10 times louder, 40db = 100 times louder, 60 db = 1000 times louder, etc.) 
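
the decibel arithmetic above can be sketched in a few lines. The sketch treats I1 and I0 as sound pressures, which is an assumption that matches the factor 20 in the formula and the reference of 20 micropascals; the function names are made up for illustration.

```python
import math

P0_PA = 20e-6   # reference sound pressure: 20 micropascals

def spl_db(pressure_pa, reference_pa=P0_PA):
    """Sound pressure level in dB relative to the reference pressure."""
    return 20 * math.log10(pressure_pa / reference_pa)

def pressure_ratio(level_db):
    """How many times the reference pressure a given level corresponds to."""
    return 10 ** (level_db / 20)
```

for example, `spl_db(200e-6)` is about 20 dB (10 times the reference pressure), and `pressure_ratio(60)` is about 1000.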


the ecology of sound intensity

 
pressure ratio   level     example                                          effect
                 0 dB      threshold of hearing (at 1000 Hz)
                 10 dB     normal breathing
10 x             20 dB     leaves rustling in a breeze
                 30 dB     empty lecture theatre
100 x            40 dB     Holloway campus at night (without planes)
                 50 dB     quiet restaurant
1,000 x          60 dB     two-person conversation
                 70 dB     Trafalgar Square
10,000 x         80 dB     vacuum cleaner
                 90 dB     huge waterfall (Niagara)                         danger level
100,000 x        100 dB    Underground train
1,000,000 x      120 dB    Propeller plane at takeoff                       hearing loss
10,000,000 x     140 dB    Heathrow: jet at takeoff (low flying Concorde)   pain level



spectrogram

auditory events can be complicated patterns of frequency and intensity (this is called a ‘spectrum’) which are modulated as function of time

to display and analyse such events, scientists use ‘spectrograms’, or ‘sonograms’: frequency composition is shown graphically as it changes in time

the three musical tones of the chord can be seen in the schematic spectrogram as succession of different fundamental & harmonic frequency clusters
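
how a spectrogram is computed can be sketched with a naive short-time Fourier analysis: the signal is cut into short frames and the frequency composition of each frame is calculated. Everything specific here (frame length, sample rate, the two test tones, no windowing or overlap) is a simplifying assumption; real analysis software uses faster and more refined methods.

```python
import cmath
import math

def spectrogram(samples, frame_len):
    """Magnitude spectrum of consecutive non-overlapping frames (naive DFT)."""
    rows = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rows.append([abs(sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                             for n, x in enumerate(frame)))
                     for k in range(frame_len // 2 + 1)])
    return rows   # rows[time_frame][frequency_bin]

SAMPLE_RATE = 800   # hypothetical, low to keep the example fast
FRAME_LEN = 80      # frequency resolution = SAMPLE_RATE / FRAME_LEN = 10 Hz

def tone(frequency_hz, n_samples):
    return [math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]

# a 100 Hz tone followed by a 200 Hz tone
signal = tone(100, 160) + tone(200, 160)
spec = spectrogram(signal, FRAME_LEN)
peaks_hz = [row.index(max(row)) * SAMPLE_RATE / FRAME_LEN for row in spec]
```

`peaks_hz` holds the strongest frequency in each time frame; plotting all magnitudes in `spec` as greylevels over time and frequency gives the spectrogram picture.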

the complexity of a spoken word

each spoken word generates a complex pattern of frequency and intensity (spectrum), which is modulated as function of time

example: the phrase 'enjoy your weekend' is recorded as

  • sonogram (or spectrogram: composition of different frequencies indicated by greylevels as function of time)
  • waveform envelope (measured as vibrations of microphone membrane) 

… imagine how difficult it is to recognize the same word from different speakers, or the voice of a particular speaker ...


the spectrum of human speech

speech sounds cover a wide range of the audible spectrum

  • vowel sounds are mainly in the lower frequency region
  • consonants cover almost the entire range

telephone systems cut off the upper part of the spectrum with minimal effects on speech recognition !!!

hearing loss

there are many different kinds of auditory impairments, apart from complete deafness

  • presbycusis: selective high-frequency hearing loss with age
  • noise exposure can lead to temporary threshold shifts (auditory fatigue) and permanent (partial) deafness
  • tinnitus: continuous humming or ringing, after some time it leads to suppression of the affected frequencies

...  imagine the consequences of such impairments of the audibility function for communication and social life ...


auditory space

how does (perceptual) auditory space represent events in a four-dimensional world (3 spatial dimensions + time)?
in vision, we have 2D-images, can easily localise objects in the visual field and see several objects at the same time

     but can you hear things at the same time in different locations?

the ear is a 1D sensor (a microphone samples at one point in space) - so :


sound localization

there is no direct representation of auditory space : location needs to be calculated from a number of cues (see Goldstein 2002, chapter 11)
(in biology there are experts for this job, like barn owls: see Konishi 1986)
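
one cue commonly used as an example is the difference in arrival time of a sound at the two ears. The sketch below is an assumption-laden illustration, not the method described in the lecture: it uses a simple path-difference model for a distant source, a hypothetical ear distance of 0.18 m, and a speed of sound of 343 m/s.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # in air at roughly room temperature
EAR_DISTANCE_M = 0.18        # hypothetical distance between the two ears

def interaural_time_difference_s(azimuth_deg, ear_distance_m=EAR_DISTANCE_M):
    """Arrival-time difference between the ears for a distant source.

    0 degrees = straight ahead, positive angles = source to the right;
    a positive result means the sound reaches the right ear first.
    """
    path_difference_m = ear_distance_m * math.sin(math.radians(azimuth_deg))
    return path_difference_m / SPEED_OF_SOUND_M_S
```

even for a source directly to one side the difference in this model is only about half a millisecond, so the auditory system has to resolve very small time differences.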


the cocktail party effect

it is easy (for young folk) to single out one particular voice from the background of a noisy pub,
or to pick up the tune of a single instrument from a large orchestra (this is called the 'cocktail party effect')

 
how can this be achieved?

the mixture of wavefronts hitting the ear has an overwhelming complexity !

(more like the ‘rippled’ surface of a pond in a storm, from Pierce 1983)

the core of the problem of the cocktail party effect (Cherry 1953) is masking :
the detection of a tone is impaired if another tone or noise is presented at the same time


auditory illusions

we can create illusions in the auditory system like in the visual system !

Shepard’s (1964) eternally rising tone:

an impossible acoustic object like Escher's (1961) 'Waterfall' ?
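
a rough sketch of how such a tone can be constructed: each step of the scale is a mixture of components one octave apart, with a bell-shaped amplitude envelope so that the highest and lowest components fade out. The base frequency, the number of octaves, the raised-cosine envelope and the function name are assumptions for illustration and may differ from Shepard's original parameters.

```python
import math

def shepard_components(step, steps_per_octave=12, base_hz=27.5, n_octaves=8):
    """(frequency, amplitude) pairs for one step of a Shepard-style scale."""
    shift = (step % steps_per_octave) / steps_per_octave   # position within the octave
    components = []
    for k in range(n_octaves):
        position = (k + shift) / n_octaves                 # 0..1 on a log-frequency axis
        frequency = base_hz * 2 ** (k + shift)
        amplitude = 0.5 * (1 - math.cos(2 * math.pi * position))  # zero at both ends
        components.append((frequency, amplitude))
    return components

scale = [shepard_components(step) for step in range(24)]   # two apparent octaves
```

every step moves all components up a little, but after 12 steps the mixture is identical to the starting one, so the scale can seem to rise forever.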



summary: auditory perception


General Reading:

Specific References:




last update 8-03-2014
Johannes M. Zanker