AISV/STELARIS Summer School 2011

(Hands-on) Course material - Mietta Lennes

All the lectures will include some examples and demonstrations and a chance to try things out on your own computer!


Lecture 1: Annotation and labeling

Try the script save_conversation_tiers_as_text_file.praat on the TextGrid file F1_F2_excerpt.TextGrid. (See instructions)


Lecture 2: How to speed up your annotation project using Praat scripts

Scripts:

  1. Insert potential utterance/pause boundaries automatically (by detecting silent intervals according to intensity): mark_pauses.praat
  2. Make sure you have a text file called labels.txt in the same directory as the next two scripts (you can make one up if you don't have it). The lines of this text file will be inserted as labels into the intervals of the (topmost) tier in the TextGrid.
  3. Insert labels for written sentences (in the case of read-aloud text): label_sentences_from_text_file.praat
  4. Insert labels for the actual utterances (make a copy of the written sentence tier and rename it as "utterance"): label_utterances_from_text_file.praat
  5. Add word and syllable tiers according to the transcript in the utterance tier (the resulting syllables will only make sense in Finnish...): add_syllable_and_word_tiers.praat
  6. Add an initial phone tier (works for Finnish): generate_phone_tier_from_words_and_syllables.praat

    Some instructions for using the scripts mentioned above
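
The idea behind step 1 (detecting silent intervals according to intensity) can be sketched as follows. This is only an illustration in Python, not the code of mark_pauses.praat: the function name, the parameters and the example values are all assumptions made for this sketch, and the intensity contour is taken to be a plain list of dB values sampled at a fixed frame step.

```python
from typing import List, Tuple


def find_silent_intervals(
    intensity_db: List[float],
    frame_step: float,
    threshold_db: float,
    min_silence: float,
) -> List[Tuple[float, float]]:
    """Return (start, end) times of runs of frames whose intensity is
    below threshold_db and that last at least min_silence seconds.

    All names and parameters here are hypothetical; they are not taken
    from mark_pauses.praat.
    """
    intervals: List[Tuple[float, float]] = []
    run_start = None  # index of the first frame of the current quiet run

    def close_run(first: int, end: int) -> None:
        # Convert frame indices to times and keep the run if it is long enough.
        start_time = first * frame_step
        end_time = end * frame_step
        if end_time - start_time >= min_silence:
            intervals.append((start_time, end_time))

    for i, level in enumerate(intensity_db):
        if level < threshold_db:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            close_run(run_start, i)
            run_start = None

    # A quiet run may extend to the end of the contour.
    if run_start is not None:
        close_run(run_start, len(intensity_db))
    return intervals


# Made-up intensity values (dB), one frame every 0.25 s.
example = [60, 62, 30, 28, 31, 61, 59, 29, 60, 25, 27]
pauses = find_silent_intervals(example, frame_step=0.25,
                               threshold_db=40, min_silence=0.5)
print(pauses)
```

Each returned pair is a candidate pause whose start and end times could become boundaries in an interval tier. The single quiet frame in the middle of the example is dropped because it is shorter than the minimum pause duration.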


Lecture 3: Descriptive systems (defining the annotation units; principles and pitfalls)

Get the annotation status of your corpus: annotation_status.praat

Instructions: Save the script, put your sound and TextGrid files into a subdirectory called corpus/, and run the script. It produces a text file called annotation_status.txt, which should give you a summary of the annotation tiers and the number of labeled intervals or points in each of them.
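
To give an idea of what such a summary contains, here is a minimal sketch in Python. It is not the code of annotation_status.praat: it assumes the tiers have already been read into a dictionary that maps each tier name to the list of its labels, and the function name and the example data are made up.

```python
from typing import Dict, List, Tuple


def annotation_status(tiers: Dict[str, List[str]]) -> Dict[str, Tuple[int, int]]:
    """For each tier, return (number of labeled items, total number of items).

    An item counts as labeled if its label is not empty or whitespace.
    The input format is an assumption made for this sketch.
    """
    status = {}
    for name, labels in tiers.items():
        labeled = sum(1 for label in labels if label.strip())
        status[name] = (labeled, len(labels))
    return status


# Made-up example: one partly labeled tier and one empty tier.
example_tiers = {
    "utterance": ["hello", "", "world"],
    "word": ["", ""],
}
for tier_name, (labeled, total) in annotation_status(example_tiers).items():
    print(f"{tier_name}: {labeled} of {total} labeled")
```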

A script for marking the prominence of syllables in a sound file: mark_prominence.praat
You need a sound file that has been annotated with utterance, word and syllable tiers. The script inserts a point tier with a point at the midpoint of each syllable. You then see and hear one utterance at a time and judge which syllables are prominent, using, e.g., an ordinal scale where 0 or an empty label means not prominent and 2 means most prominent. You can continue working with the same sound file over several sessions.
NB: This is not a real experimental setup.
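
The first step, placing a point at the midpoint of each syllable, can be sketched as follows. This is an illustration in Python rather than the code of mark_prominence.praat: syllables are assumed to be (start, end, label) tuples in seconds, and the function name and the example values are made up.

```python
from typing import List, Tuple

Syllable = Tuple[float, float, str]  # (start time, end time, label)


def syllable_midpoints(syllables: List[Syllable]) -> List[Tuple[float, str]]:
    """Return one (time, mark) point per syllable, placed at its midpoint.

    The mark starts out empty; in the scale described above, an empty
    mark means "not prominent" until the annotator changes it.
    """
    return [((start + end) / 2, "") for start, end, _label in syllables]


# Made-up syllable intervals.
example_syllables = [(0.0, 0.5, "pu"), (0.5, 1.0, "he"), (1.0, 1.5, "tta")]
print(syllable_midpoints(example_syllables))
```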


Lecture 4: Exploiting the annotation (analysing your speech corpus with Praat)


Exercises and downloads for the course


Annotation tools

Other tools for you to try out (these will not be further discussed during the lectures):


Links and additional information

Metadata for speech corpora

If you wish, you can take a look at the webpage for the IPR and Metadata Workshop that is being held in Helsinki this week. The page contains an e-form for collecting a set of the most important metadata elements for speech corpora (find the link to "metadata for audio corpora"). However, please do not press the Submit button in the e-form, because it might confuse the workshop organizers ;-)

Speech Corpus Toolkit for Praat - Mietta's Praat script site


References mentioned during the lectures