Helsinki Corpus of British English Dialects
The Helsinki Corpus of British English Dialects (HD) is a collection of orthographically
transcribed, audio-recorded speech, mainly from East Anglia and the South-West, with a smaller collection from
Lancashire. The recordings were made in the 1970s and 1980s by Finnish postgraduates.
The aim of the corpus is to provide material for linguistic research in the fields of dialectology, sociolinguistics, discourse analysis,
morphology, syntax and phonology. The corpus also provides material for non-linguistic, multidisciplinary research, such as ethnography of
communication, local habits and history.
Project leaders: Ossi Ihalainen 1984–1993,
Kirsti Peitsara 1997–2006, Anna-Liisa Vasko 2007 onwards.
Fieldwork supervisors: In the 1970s, Tauno F. Mustanoja (Helsinki) in cooperation with Harold Orton (Leeds).
Ossi Ihalainen supervised the fieldwork done in the late 1980s.
Size: 187 files containing 1,008,641 words in total.
Time periods: 1970s and 1980s (CAM, DEV, ELY, SOM, SUF), 1980s (ESS, LAN).
Status: First stage completed in 2006, with corpus material available for research. Second stage is ongoing,
with plans to extend the corpus with previously unpublished data.
Corpus data: The primary data are audio recordings of dialect speech.
Funding: Finnish Cultural Foundation; Academy of Finland; University of Helsinki.
Language: English, rural (Cambridgeshire, Devon, Isle of Ely, Somerset, Suffolk) and urban (Essex, Lancashire).
Reference line and Copyright
The Helsinki Corpus of British English Dialects (2006). Department of Modern Languages, University
of Helsinki. All the material consists of interviews conducted by the fieldworkers listed below, who hold
full copyright to the material. For permission to use the files, contact Anna-Liisa Vasko
(firstname.lastname@example.org) or Kirsti Peitsara (email@example.com).
Kirsti Peitsara and Anna-Liisa Vasko.
Cambridgeshire proper (CAM) - Anna-Liisa Ojanen (Vasko)
Devon (DEV) - Ossi Stigell
Essex & Lancashire (ESS, LAN) - Riitta Kerman
Isle of Ely (ELY) - Irmeli Tammivaara-Balaam
Somerset (SOM) - Ossi Ihalainen
Suffolk (SUF) - Leena Pasanen
Graduate and postgraduate research assistants
Maarit Alanko, Tuula Chezek, Sanna Huttunen, Minna Korhonen, Jaana Suviniitty, Eero Timoskainen.
Peitsara, Kirsti. Manual to the Dialectal Part of the Helsinki Corpus of British English Dialects.
Available for those working in VARIENG.
The coding system is based on the set of ASCII codes (96 printable characters). The names of the
187 files follow MS-DOS conventions, which limit a file name to eight characters. Each file name
begins with the letters DI (for dialect), followed by the first three letters of the county the samples are
taken from and an ordinal number (e.g. DIDEV01, DISOM28, DIELY50).
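The naming convention above can be illustrated with a short Python sketch. It is not part of the corpus tools: the helper name parse_file_name, the regular expression, and the decision to reject unknown codes are assumptions made for illustration. The area codes and names are those listed in this entry, and the pattern assumes a base name without a file extension.

```python
import re

# Area codes and names as listed in this entry.
AREAS = {
    "CAM": "Cambridgeshire",
    "DEV": "Devon",
    "ELY": "Isle of Ely",
    "ESS": "Essex",
    "LAN": "Lancashire",
    "SOM": "Somerset",
    "SUF": "Suffolk",
}

# Assumed pattern: "DI" + three-letter area code + ordinal number.
NAME_RE = re.compile(r"DI([A-Z]{3})(\d+)")


def parse_file_name(name):
    """Hypothetical helper: return (area code, area name, ordinal number)."""
    base = name.upper()
    # MS-DOS base names are limited to eight characters.
    if len(base) > 8:
        raise ValueError(f"name longer than eight characters: {name!r}")
    match = NAME_RE.fullmatch(base)
    if match is None:
        raise ValueError(f"name does not follow the DI + area + number pattern: {name!r}")
    code, number = match.group(1), int(match.group(2))
    if code not in AREAS:
        raise ValueError(f"unknown area code: {code!r}")
    return code, AREAS[code], number


print(parse_file_name("DIDEV01"))  # ('DEV', 'Devon', 1)
print(parse_file_name("DISOM28"))  # ('SOM', 'Somerset', 28)
```

Rejecting unknown codes is a deliberate choice in this sketch; a more permissive variant could return the raw code instead.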
On-site access; available (for people working in VARIENG) in WordCruncher. For permission to use the
files, contact Anna-Liisa Vasko (firstname.lastname@example.org) or Kirsti Peitsara (email@example.com).
CoRD Entry submitted on March 14, 2008 by Dr. Anna-Liisa Vasko and Simo Ahava, Department of English, University of Helsinki.