Last updated: 11.10.2001

 

Mietta's Labeling Page

A few general principles...
Labeling tools
Levels of transcription
Worldbet
Worldbet symbols

About this page

I am a phonetician who is interested in Finnish spontaneous speech. For my work, I need to collect large amounts of conversational speech data of high acoustic quality, and for the most part, the material has to be segmented and transcribed manually. So far, I have been doing a lot of the labeling work myself. Labeling is quite error-prone, and it is easy to "forget" or change your own segmentation principles. People tend to disagree on segment boundary placement and on the phonetic labels for segments, and they are not even very consistent with themselves.

This page documents the principles that I have been using (or those that I think I have been using). Primarily, it is a notebook for myself, but I hope it will also help others to learn the exhausting secrets of labeling work... Many of the principles stated below were either inspired by or jointly agreed upon with my friend Nina Alarotu, who undoubtedly is still the number one speech labeler in Finland, considering the number of hours she has spent labeling the Finnish speech database! The symbol lists are based on the Worldbet references that are available on the web; I have made only slight modifications in order to have a useful and easy-to-remember symbol set for my purposes.

This page is constantly under construction: I will elaborate on the different aspects and problems of transcription as I go along. Please feel free to comment!

Mietta Lennes
mietta.lennes@helsinki.fi

 

A few general principles in speech segmentation and phonetic transcription


  1. Forget the writing rules!

  2. If you cannot find as many boundaries as you expect, or if there are more qualitative turning points than expected boundaries, change the number of boundaries and select the best symbol that comes to mind. That is better than putting a boundary in an arbitrary place and using an ill-fitting symbol.

  3. When selecting labels, think of the position of the vocal tract and where the articulators are moving. (Phonetically speaking, Finnish does have, e.g., voiced alveolar and postalveolar fricatives and velar approximants!)

Suggest more!

 

Labeling tools

I use Praat for the segmentation and labeling of my sound files.

Praat can handle very long sound files, up to 2 gigabytes in size. Such a file can be opened as a LongSound object, which means that the sound is not fully loaded into memory but can still be viewed and accessed together with a corresponding TextGrid object. TextGrids are nothing but text files, so you can keep long conversational speech files on CD-ROMs or other read-only media and edit and save only the small TextGrid file when labeling. You can easily share the TextGrids with others, and each person can modify her own copy of the TextGrid, as long as the sound file is not changed.
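
Because a TextGrid is plain text, the labels can also be read outside Praat. The following Python sketch shows one possible way to pull the intervals of each tier out of a TextGrid saved in Praat's long text format. It is only an illustration under assumptions: the function name `read_interval_tiers` and the sample data are invented for this example, only IntervalTiers are read, and short-format or binary TextGrids are not handled.

```python
import re

# Matches the header of one interval tier in Praat's long text format.
TIER_RE = re.compile(r'class = "IntervalTier"\s+name = "((?:[^"]|"")*)"')
# Matches one interval: xmin, xmax and the (possibly empty) label text.
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s+xmin = (\S+)\s+xmax = (\S+)\s+text = "((?:[^"]|"")*)"'
)

def read_interval_tiers(textgrid_text):
    """Return {tier name: [(start, end, label), ...]} for all IntervalTiers."""
    tiers = {}
    headers = list(TIER_RE.finditer(textgrid_text))
    for i, header in enumerate(headers):
        # The intervals of a tier run from its header to the next tier header.
        if i + 1 < len(headers):
            stop = headers[i + 1].start()
        else:
            stop = len(textgrid_text)
        chunk = textgrid_text[header.end():stop]
        tiers[header.group(1)] = [
            # Praat doubles any quotation mark inside a label; undo that here.
            (float(m.group(1)), float(m.group(2)), m.group(3).replace('""', '"'))
            for m in INTERVAL_RE.finditer(chunk)
        ]
    return tiers

# Hypothetical miniature TextGrid with a single "phone" tier.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "phone"
        xmin = 0
        xmax = 1
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 0.4
            text = "k"
        intervals [2]:
            xmin = 0.4
            xmax = 1
            text = "A"
'''

phone_tier = read_interval_tiers(SAMPLE)["phone"]
```

For a real file, the text would first be read with the encoding in which Praat saved it; that step is left out here because the encoding depends on the Praat settings.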



Levels of transcription

In Praat, you can use many IntervalTiers for different levels of transcription. I have tried to utilize this possibility to the full, since many different levels allow many kinds of searching criteria later on. Here, I discuss those levels which I think are useful. I also point out some segmentation and transcription problems and give my solutions to them (for the time being; I may change my mind...)
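
To illustrate what searching across levels can look like, here is a minimal Python sketch (not a Praat script). It assumes that each tier has already been read into a list of (start, end, label) tuples; the function name `intervals_within` and the sample data are invented for this illustration. It returns the intervals of one tier that lie inside intervals of another tier carrying a given label, e.g. the phones inside words marked 'x' on the accent level described below.

```python
def intervals_within(inner, outer, label, tol=1e-6):
    """Return the intervals of `inner` that lie fully inside an interval of
    `outer` whose label equals `label`. Tiers are lists of (start, end, label)."""
    spans = [(start, end) for start, end, lab in outer if lab == label]
    return [
        (start, end, lab)
        for start, end, lab in inner
        if any(start >= lo - tol and end <= hi + tol for lo, hi in spans)
    ]

# Hypothetical data: three phones; only the first word is marked as accented.
phones = [(0.0, 0.2, "k"), (0.2, 0.5, "A"), (0.5, 0.8, "n")]
accents = [(0.0, 0.5, "x"), (0.5, 0.8, "")]

accented_phones = intervals_within(phones, accents, "x")
```

The same function could be pointed at any other pair of tiers, which is the practical benefit of keeping each level of transcription on its own tier.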

Phone level
Phoneme level
Syllable level
Word level
Accent level
Intonation Phrase (IP) level
Utterance level
Nonspeech level
Topic level
"Ready?" level

 

  1. Phone level: My transcription method is rather phonetic: I try to find the closest possible Worldbet symbol for each speech sound I find in the signal, adding all necessary diacritics. I determine the phone boundaries and labels on an acoustic-phonetic basis: partly by inspecting spectrograms, intensity curves, etc., and partly by listening to shorter and longer segments of speech. A phonetic transcription is supposed to describe what is actually produced, so I make as little reference to phonemes or "what-there-should-be" as possible.

  2. Phoneme level: This level I do not use, because it has turned out to be too difficult and frustrating to look for the corresponding speech segments for a Finnish phoneme string. The phonemic structure of a word of casual Finnish speech cannot always be defined in a straightforward way, since the corresponding written word forms may not exist. Moreover, we do have an orthographic transcription of the utterances, and it is quite simple to convert the orthographic symbol strings into phoneme-like strings, since Finnish orthography is said to follow the phonemic structure rather closely. However, I am working on a Praat script that would automatically convert my narrow phonetic segmentation and labeling to a "quasi-phonemic" segmentation and labeling for situations where this representation might be needed.

  3. Syllable level: I have attempted to mark all syllable boundaries. This should be done after segmenting and labeling the phonetic level, perhaps at the same time as marking word boundaries. The problems with this level are similar to those with word boundaries: in Finnish spontaneous speech, the definition of a syllable and its boundaries often relies completely on the intuition of the language speaker. Syllable boundaries in the middle of geminates are marked in the same way as geminates at word boundaries.
    A boundary in the syllable level does not have to coincide with a boundary in the phonetic level, but word boundaries have to coincide with syllable boundaries.

  4. Word level: I have attempted to mark all word boundaries. This should be done after segmenting and labeling the phonetic level. A boundary in the word level does not have to coincide with a boundary in the phonetic level, but it has to coincide with a syllable boundary.
    Problems:

  5. Accent level: The boundaries in this level should be identical to word level boundaries. I have marked with an 'x' all words that to me are clearly prominent (bear a sentence accent). As with any perceptual prominence judgments, my decisions are not the ultimate truth, but I consider it useful to have even a preliminary and subjective indication of accentuation. At this point, no distinction is made between different sentence accent types etc. An utterance may contain several accented words or none.
    Thus, (sentence) accent is now marked roughly as a property of a whole word, and no indication about the more specific domain of accent is given. In Finnish, there is no lexical stress (lexical items are generally not distinguished with regard to accent placement), but in case a word receives a (sentence) accent, it is usually the first syllable that is perceived as most prominent. So, although the real situation is more complicated than that, we may assume that the first syllable of an accented word is the prominent one.

  6. Intonation Phrase level (IP): This level is not very well developed. I have just been trying to cut utterances into smaller phrases which would be somewhat coherent prosodically. No fine definitions so far.

  7. Utterance level: An utterance is a stretch of speech by a single speaker between two pauses. A pause is any period of silence where the speaker is not articulating a speech sound. The labels of utterance level intervals are the orthographic transcriptions of the utterances. Boundaries must coincide with intonation phrase boundaries.

  8. Nonspeech level: This level describes long-term "non-speech" properties which may overlap with speech sound production: breathing (in or out; ingressive speech can occur for short utterances in Finnish), laughter; or external sounds, like different background noises. Those phenomena that cannot overlap with speech articulations can be marked in the utterance level or phone level (the Worldbet symbols for these features always begin with a dot '.', so they can be distinguished from phones).
    Actually, I have not started to use this level yet, but I plan to...

  9. Topic level: A rather loosely defined tier, where I mark any big changes in the discourse topic, e.g., "Grandma's new house", "New job", etc. I mark the topics in English, to make the TextGrid file more legible to an international audience.

  10. "Ready?" level: I label with 'ok' all the stretches of the sound file which have been fully segmented, labeled and checked. This way, when running analysis scripts, I can select only data that has been checked.
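
Several of the levels above constrain one another: word boundaries must fall on syllable boundaries, accent level boundaries should be identical to word boundaries, and utterance boundaries must fall on intonation phrase boundaries. A small consistency check can catch violations before any analysis is run. The Python sketch below is only an illustration: tiers are assumed to be lists of (start, end, label) tuples, and the function names, the sample data and the tolerance value are assumptions made for this example, not anything defined by Praat.

```python
def boundaries(tier):
    """Collect all boundary times of a tier given as (start, end, label) tuples."""
    times = set()
    for start, end, _label in tier:
        times.add(start)
        times.add(end)
    return sorted(times)

def missing_boundaries(coarse, fine, tol=0.001):
    """Return the boundaries of `coarse` that have no boundary of `fine`
    within `tol` seconds."""
    fine_times = boundaries(fine)
    return [
        t for t in boundaries(coarse)
        if not any(abs(t - f) <= tol for f in fine_times)
    ]

# Hypothetical tiers: the word boundary at 0.45 s has no syllable boundary.
syllables = [(0.0, 0.2, "s1"), (0.2, 0.5, "s2"), (0.5, 0.9, "s3")]
words = [(0.0, 0.45, "w1"), (0.45, 0.9, "w2")]

problems = missing_boundaries(words, syllables)
```

To test that two tiers have identical boundaries, as required for the accent and word levels, the check can be run in both directions; both calls should then return an empty list.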



Worldbet

For the phonetic transcription of speech, I use Worldbet, which is an ASCII version of the International Phonetic Alphabet (IPA). There are a few basic Worldbet documentation sources on the web (please let me know if you spot better ones):

However, these sources are not quite consistent with regard to symbol definitions, and I have had to make a few slight modifications and additions. Below, I will try to explain each Worldbet symbol and convention that I use for Finnish. I will also refer to the corresponding IPA symbols.

In Worldbet, the base symbol is one ASCII character or sometimes a combination of two characters (see Worldbet symbols). Additional features or feature changes are indicated with diacritics, in the same manner as in IPA. Diacritics are separated from the base symbol with an underscore '_'. I use diacritics freely whenever they are needed. When a sound needs several diacritics, I add an underscore in front of each one. This way, a script or a program can easily pick out each diacritic, even when the diacritics have been given in a different order.
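
Because every diacritic is introduced by its own underscore, a label can be split mechanically into a base symbol and a set of diacritics, and two labels that list the same diacritics in a different order can be recognized as equivalent. A minimal Python sketch, assuming that labels follow exactly this convention (the function names are hypothetical):

```python
def split_label(label):
    """Split a Worldbet label into (base symbol, frozenset of diacritics)."""
    base, *diacritics = label.split("_")
    return base, frozenset(diacritics)

def same_phone(label_a, label_b):
    """True if two labels share the base symbol and the same diacritics,
    regardless of the order in which the diacritics were written."""
    return split_label(label_a) == split_label(label_b)

# "t_[_h": base symbol "t" with the diacritics "[" (dental) and "h" (aspirated).
base, marks = split_label("t_[_h")
equivalent = same_phone("t_[_h", "t_h_[")
```

Storing the diacritics as a set is a design choice that makes the comparison independent of order; a script that needs the original order could keep a list instead.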

It seems to me that a given auditory impression (e.g., a centralized vowel quality, or the voiced-voiceless distinction in consonants) can often be indicated with two or more different symbol combinations (a base symbol alone, or a base symbol plus diacritics). I often prefer to use a common and simple base symbol and to add more diacritics. Since we are not making specific claims about phonemic structure, and thus not "deriving" the phones from phonemes, it does not matter which symbol is used as the base, as long as the base symbol and the diacritics together imply the same set of perceived phonetic features. (Moreover, different transcribers will use slightly different symbols anyway!)

The list of Worldbet symbols



Worldbet symbols

as used for the transcription of spontaneous Finnish

General information about Worldbet and my transcription practices

Vowels
Consonants
Diacritics
Non-speech and other sounds

A couple of my old maps of Worldbet symbols:
Consonants
Vowels and diacritics


Vowels

The symbols I have more or less systematically used for Finnish are marked in bold.
(Unfortunately, the corresponding IPA symbols are still missing. I'm working on it...)

IPA Worldbet Description
  i front high unrounded
  y front high rounded
  I front high unrounded centralized
  Y front high rounded centralized
  e front mid-high unrounded
  7 front mid-high rounded
  E front mid-low unrounded
  8 front mid-low rounded
  @ front low unrounded, between mid-low and low (the Finnish /ä/ as reference)
  a front low unrounded
  6 front low rounded
  ix (the Russian /i/ as reference)
  ux (the Finland-Swedish /u/ as reference)
  & schwa = central vowel
  ox, &_w rounded schwa = rounded central vowel
  3 central mid-low unrounded
  ax central low/mid-low unrounded
  4 back high unrounded
  u back high rounded
  U, u_x back high rounded centralized
  2 back mid-high unrounded
  o back mid-high rounded
    back mid-low unrounded
  > back mid-low rounded
  A back low unrounded (the Finnish /a/ as reference)
  5 back low rounded
   

Consonants

Stops
Nasals
Fricatives
Affricates
Approximants
Laterals
Flaps/ taps
Trills
Ejectives
Implosives
Clicks


Stops

The symbols I have more or less systematically used for Finnish are marked in bold.
(Unfortunately, the corresponding IPA symbols are still missing. I'm working on it...)

IPA Worldbet Description
  p voiceless bilabial stop
  t[, t_[ voiceless dental stop
  t voiceless (apico)alveolar stop
  tl voiceless lateral alveolar stop
  tr voiceless retroflex stop
  tn, t_n voiceless nasalized stop
  c voiceless palatal stop
  cp voiceless labial palatal stop
  k voiceless velar stop
  kp voiceless labial velar stop
  q voiceless uvular stop
  ? glottal stop
  b voiced bilabial stop
  d[, d_[ voiced dental stop
  d voiced (apico)alveolar stop
  dl voiced lateral alveolar stop
  dr voiced retroflex stop
  dn, d_n voiced nasalized stop
  J voiced palatal stop
  Jb voiced labial palatal stop
  g voiced velar stop
  gb voiced labial velar stop
  Q voiced uvular stop
  ph voiceless bilabial aspirated stop
  t[h voiceless dental aspirated stop
  th voiceless (apico)alveolar aspirated stop
  ch voiceless palatal aspirated stop
  kh voiceless velar aspirated stop
  qh voiceless uvular aspirated stop
  bh voiced bilabial aspirated stop
  d[h voiced dental aspirated stop
  dh voiced (apico)alveolar aspirated stop
  Jh voiced palatal aspirated stop
  gh voiced velar aspirated stop
  Qh voiced uvular aspirated stop
  pH voiceless bilabial hyperaspirated stop
  tH voiceless (apico)alveolar hyperaspirated stop
  tR voiceless retroflex hyperaspirated stop
  tN voiceless nasalized hyperaspirated stop
  kH voiceless velar hyperaspirated stop
  bH voiced bilabial hyperaspirated stop
  dH voiced (apico)alveolar hyperaspirated stop
  dR voiced retroflex hyperaspirated stop
  dN voiced (nasalized) hyperaspirated stop
  gH voiced velar hyperaspirated stop
   

Nasals

The symbols I have more or less systematically used for Finnish are marked in bold.
(Unfortunately, the corresponding IPA symbols are still missing. I'm working on it...)

IPA Worldbet Description
  m bilabial nasal
  M labiodental nasal
  n[ dental nasal
  n (apico)alveolar nasal
  nl lateral alveolar nasal
  nr retroflex nasal
  nj palatal nasal
  N velar nasal
  nm labial velar nasal
  Nq uvular nasal
   

Fricatives

The symbols I have more or less systematically used for Finnish are marked in bold.
(Unfortunately, the corresponding IPA symbols are still missing. I'm working on it...)

IPA Worldbet Description
  F voiceless bilabial fricative
  f voiceless labiodental fricative
  T voiceless dental fricative
  s voiceless (apico)alveolar fricative
  s{ voiceless laminoalveolar fricative
  hl voiceless lateral alveolar fricative
  sr voiceless retroflex fricative
  S voiceless postalveolar fricative
  S{ voiceless laminopalatoalveolar fricative
  C voiceless palatal fricative
  x voiceless velar fricative
  W voiceless labial velar fricative
  X voiceless uvular fricative
  HH voiceless pharyngeal fricative
  h voiceless glottal fricative
  H voiceless epiglottal fricative
   
...under construction...sorry...

Affricates

Under construction...

The symbols I have more or less systematically used for Finnish are marked in bold.
(Unfortunately, the corresponding IPA symbols are still missing. I'm working on it...)


Approximants

Under construction...


Laterals

Under construction...


Flaps / taps

Under construction...


Trills

Under construction...


Ejectives

Under construction...


Implosives

Under construction...


Clicks

Under construction...


Diacritics

IPA Worldbet Description
  _0 voiceless
  _v voiced
  _h aspirated
   
...under construction...sorry...

Non-speech and other sounds

Worldbet Description
.fp filled pause
.tcl, .tc tongue click
.ls lip smack
.br breath noise (.bri = breath in, .bro = breath out)
.glot glottalization
.vs squeak, voice crack
.laugh laugh
.ct clear throat
.cough cough
.sniff sniff
.sneeze sneeze
.yawn yawn
.burp burp
.uu unintelligible speech
.ns human, not speech
.bn background noise
.ln line noise
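
Since every non-speech symbol starts with a dot, labels can be sorted into phones and non-speech events with a simple prefix test. The Python sketch below copies a few codes from the table above into a dictionary; the names `NONSPEECH`, `is_nonspeech` and `describe`, as well as the sample label sequence, are invented for this illustration.

```python
# A few codes copied from the table above (not the complete list).
NONSPEECH = {
    ".fp": "filled pause",
    ".ls": "lip smack",
    ".bri": "breath in",
    ".bro": "breath out",
    ".laugh": "laugh",
    ".bn": "background noise",
}

def is_nonspeech(label):
    """Non-speech Worldbet symbols always begin with a dot."""
    return label.startswith(".")

def describe(label):
    """Look up a non-speech code; phones and unknown codes give None."""
    return NONSPEECH.get(label)

# Hypothetical label sequence from a phone tier.
labels = ["k", "A", ".fp", "n", ".laugh"]
phones_only = [lab for lab in labels if not is_nonspeech(lab)]
events = [describe(lab) for lab in labels if is_nonspeech(lab)]
```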

 
