Mietta Lennes

Comparison of acoustic, psychoacoustic, and psychophysiological distances of Finnish vowels

Master's thesis, University of Helsinki


Speech perception may be considered a special case of our ability to recognize patterns. In speech, the allophones of phonemes are realized as various sounds, which are often very difficult to distinguish at the segmental level. In order to perceive speech, individuals probably possess some kind of memory traces of both the sounds and the higher-level phonetic units of their native language, in addition to knowledge of the linguistic system. These memory representations can be seen as patterns exhibiting the general dimensions of their objects.

Auditory perception is modified by the structural and physiological properties of the ear, the cochlear nerve and the brain. Therefore, auditory physiology and the results of psychoacoustics must be taken into account when considering the perceptual distances between vowels. The distance measure depends on the selected auditory model. In the present study, the distances between the vowel stimuli were measured as Euclidean distances between whole FFT spectra and between whole auditory spectra. The distances obtained from the two spectral representations differed slightly.
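As an illustration of the distance measure, the following minimal sketch computes the Euclidean distance between two whole spectra sampled on the same frequency grid. The function name `euclidean_distance` and the toy level values are assumptions made for this example only; they are not data or code from the study, and the sketch does not implement any particular auditory model.

```python
import math


def euclidean_distance(spectrum_a, spectrum_b):
    """Euclidean distance between two spectra given as equal-length lists of levels."""
    if len(spectrum_a) != len(spectrum_b):
        raise ValueError("spectra must be sampled on the same frequency grid")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(spectrum_a, spectrum_b)))


# Hypothetical level values (dB) at four frequency points; not measurements from the study.
vowel_x = [60.0, 55.0, 40.0, 30.0]
vowel_y = [60.0, 52.0, 44.0, 30.0]
vowel_z = [50.0, 55.0, 40.0, 42.0]

d_xy = euclidean_distance(vowel_x, vowel_y)
d_xz = euclidean_distance(vowel_x, vowel_z)
print(d_xy, d_xz)
```

The same function applies to either representation: only the input vectors change (for example, FFT magnitudes versus the output of an auditory model), which is why the two representations can yield different distances for the same pair of stimuli.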

The mismatch negativity (MMN) is a negative deflection in the auditory event-related potential (ERP) of the brain. It is elicited when a deviant sound stimulus occurs in a sequence of repetitive standard sound stimuli. The hypothesis in this study was that the MMN amplitude is directly proportional to the psychoacoustic distance between vowel stimuli and that the prototypicality of the stimuli also affects the MMN amplitude. A prototype vowel is here considered a typical example of a specific speech sound, either in production or in perception.

There were 18 vowel stimuli in the experiments: eight stimuli represented prototypical vowels in Finnish and ten were non-prototypical "intermediates" of the prototypes. In the ERP measurements, four Finnish subjects were presented with stimulus sequences in ten different stimulus combinations (conditions). The standard stimulus was always a prototype vowel. There were three deviants in each condition: two prototypes and their non-prototypical intermediate vowel. Each deviant occurred with a probability of 5%. ERPs were recorded and MMN amplitudes were measured for each individual subject. All deviant stimuli in all ten conditions elicited measurable MMNs in each subject.
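The structure of one such condition can be sketched as follows: a standard stimulus and three deviants, each deviant drawn with a probability of 5%, which leaves 85% for the standard. This is only an illustrative sketch. The function name `oddball_sequence`, the stimulus labels, the number of trials and the use of independent random draws are all assumptions; the abstract does not state how the sequences were actually ordered or how long they were.

```python
import random


def oddball_sequence(n_trials, standard, deviants, p_deviant=0.05, seed=None):
    """Draw a stimulus sequence in which each deviant has probability p_deviant."""
    p_standard = 1.0 - p_deviant * len(deviants)
    if p_standard <= 0.0:
        raise ValueError("deviant probabilities leave no room for the standard")
    rng = random.Random(seed)  # seeded generator so the sketch is reproducible
    stimuli = [standard] + list(deviants)
    weights = [p_standard] + [p_deviant] * len(deviants)
    # Independent draws: an assumption, real paradigms often constrain the order.
    return rng.choices(stimuli, weights=weights, k=n_trials)


# Hypothetical labels: one prototype standard, two prototype deviants, one intermediate.
sequence = oddball_sequence(
    n_trials=2000,
    standard="standard_prototype",
    deviants=["deviant_prototype_1", "deviant_prototype_2", "deviant_intermediate"],
    seed=1,
)
counts = {label: sequence.count(label) for label in set(sequence)}
print(counts)
```

With independent draws, deviants can occasionally follow one another directly; a real oddball design would typically add ordering constraints that this sketch leaves out.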

In the identification experiment, the same four subjects were asked to identify each of the 18 vowel stimuli as one of the Finnish vowels or to indicate that the stimulus could not be identified. The subjects were also asked to assess the prototypicality of the identified vowel stimuli. There was considerable interindividual variation in the identifications, although the stimuli that had been intended to be prototypical usually received higher prototypicality scores than the corresponding intermediate stimuli.

The subjective degree of prototypicality did not significantly affect MMN amplitudes. The MMN amplitude was directly proportional to the Euclidean distances between the vowel stimuli in both the acoustic and the psychoacoustic spectra.
