The consonantal element (th) in some Late Middle English Yorkshire texts

Vibeke Jensen
Volda University College


This paper examines the distribution of variant spellings for the consonantal element (th) within a corpus of 43 Late Middle English Yorkshire texts: 21 religious prose texts and 22 legal documents. The consonantal element corresponding to PDE /θ, ð/, here defined as the variable (th), shows much variation in Middle English. It also shows specific developments in the northern area, which raise interesting questions about orthographic practices and sound-to-spelling relationships. The variant spellings within each text have here been recorded, and an analysis of their chronological and geographical distribution has been carried out. The study is based on two types of data: a manual questionnaire survey of entire texts or large samples of them, and a subset of the Middle English Grammar Corpus (Stenroos et al. 2011), which consists of transcriptions of 3,000-word samples of the same texts. All of the texts were localised in the Linguistic Atlas of Late Mediaeval English (McIntosh, Samuels and Benskin 1986). There is within the religious material no clearly discernible correlation between type of spelling and date, or between spelling and geographical localisation. For the documents, on the other hand, there is a slight increase of the <th> spelling over time. On the whole, the material shows strong evidence for a ‘Northern system’ that distinguishes between voiced and voiceless fricatives. At the same time, the material shows a clear distinction between grammatical words and lexical words, and variation between <y/þ> and <th> could therefore tentatively be interpreted as lexical rather than phonemic.

1. Introduction

This article examines the distribution of variant spellings for the consonantal element (th) within a corpus of 43 Late Middle English Yorkshire texts. The texts have been localised on linguistic grounds in the West Riding of Yorkshire and the City of York in the Linguistic Atlas of Late Mediaeval English (McIntosh, Samuels and Benskin 1986; henceforth LALME). They are dated to the late fourteenth and the fifteenth centuries.

The focus of this article is written variation, and the main aim is to map out and discuss the geographical and chronological distribution of orthographic variant forms corresponding to PDE <th> in medieval West Riding of Yorkshire and the City of York. The different spellings are here defined as realisations of the variable (th). The Middle English material shows much linguistic variation and the variable (th) shows specific developments in the northern area that raise interesting questions about orthographic practices and sound-to-spelling relationships. In Middle English, the northern area may be considered a fairly well-defined dialectal area in terms of the distributions of a large number of dialectal features, and the West Riding of Yorkshire forms a transitional area with regard to features that show marked north-south distributions.

This article is partly based on my PhD dissertation, Studies in the medieval dialect materials of the West Riding of Yorkshire (Jensen 2010), which is an investigation of the medieval dialect materials of the pre-1974 West Riding of Yorkshire and the City of York. The main aim of the thesis was to study the distribution of ‘northern’ and ‘non-northern’ dialectal forms, and to relate the patterns to the overall socio-historical development of the English language. The investigation was based on prose texts, including both religious prose and legal documents. This work formed a direct part of the Middle English Grammar Project (MEG), a long-term research programme shared by the Universities of Stavanger and Glasgow.

The main focus of this article is orthographic variation, and the methodology employed is based on the principles first established in the compilation of LALME: namely that written language can, and perhaps should, be studied in its own right. The aim is here to relate different orthographic forms of the variable (th) to external variables such as date and geographical localisation. The study makes use of two types of data: a manual questionnaire survey of entire texts or large samples of them, and a computer-based analysis of 3,000-word tranches of the same texts, which form a subset of the Middle English Grammar Corpus (henceforth MEG-C; Stenroos et al. 2011–, version 2011.1).

2. The variable (th)

The consonantal element (th) corresponds to the PDE dental fricatives /θ/ and /ð/. In Old English the dental fricative posed a challenge for the writing system: the Latin alphabet had no letter corresponding to this sound, and in Old English there developed three different spellings: <th>, <þ> and <ð>. In the earliest texts, <th> was the most common spelling; fairly soon, however, it was replaced by the runic symbol <þ> ‘thorn’ and the modified form of <d> that is often known as ‘eth’, <ð>. During the Old English period, <þ> and <ð> were used interchangeably to represent the dental fricative. In Old English, there was no phonemic distinction between the voiced and the voiceless realisations; these were allophones in complementary distribution, and there was thus no need for different spellings. As with the other Old English fricatives, the voiceless allophones appeared in initial and final position and the voiced ones in medial position; in addition, medial geminates were voiceless. It was not until the transition from Old to Middle English that the distinction between voiced and voiceless variants of the dental fricative ([ð] and [θ] respectively) became phonemic; this process is usually dated to the fourteenth century (Lass 1992). It should be noted, however, that the phonemic distinction has always been relatively marginal, with few minimal pairs (the pair ‘thy’ and ‘thigh’ is usually cited as the only one in PDE).

Whereas the letter <ð> had become obsolete by around 1300, the letter <þ> ‘thorn’ was retained in most written varieties of Middle English to represent the dental fricatives, both voiced and voiceless. However, from the fourteenth century, the spelling <th> gradually won ground and eventually replaced <þ> altogether.

The gradual loss of <þ> in favour of <th> in the English spelling system has traditionally been explained by the introduction of print. However, Stenroos (2006: 10) argues that the replacement of <þ> was a gradual process that had begun long before the advent of printing. She bases her argument on the following three points. Firstly, the use of <th> appears in the southern part of England as early as ca. 1300 and grows steadily more common through the fourteenth and fifteenth centuries. Secondly, Scragg (1974: 67) suggests that printing and scribal traditions may have been relatively isolated from each other until well into the sixteenth century. Thirdly, it is not completely obvious that the carrying over of thorn into printing would have posed an insurmountable problem: some types used in England did include thorn, and the letter ‘y’ was always available.

During the Middle English period there are thus two main variants representing the variable (th), namely <þ> and <th>. An added complication is that, in many scribal hands, the realisations of <þ> and <y> came to be identical, usually resulting in a y-shaped letter. The development of this merger, which seems to have originated in Textura scripts, has been discussed in detail by Michael Benskin (1982: 14 ff). He shows that the merger has a northern geographical distribution in Middle English, something which was first noted by McIntosh (1974: 49): “the two graphemes had fallen together almost universally both in Scotland and over the greater part of the north of England well before the end of the fourteenth century.” According to Benskin (1992: 85–86) this orthographic change came about through changes in styles of script, and has nothing to do with phonology: “in northern as in southern dialects [ð] and [θ] remained.”

Northern orthographic systems thus tend to contain one grapheme less than the southern ones. At the same time, many northern writers seem to make a distinction with regard to (th) that is absent from southern usages. It appears that many northern writers distinguish between voiced and voiceless dental fricatives, something which was described by Benskin (1977: 506–507) in an important footnote:

There thus arises a system whereby (1) words like think, through, thousand are spelled th-, but (2) words like they, them, there are spelled þ- or y-. The use of þ (or y for þ) is hence phonetically conditioned in the orthographies of a great many scribes, an observation which seems to have eluded most scholars.

In northern Middle English there thus seems to arise a system in which the voiced fricative is spelled <þ> or <y> and the voiceless fricative is spelled <th>. This system has been discussed further by Stenroos (2004: 267), who suggests that it “may be said to contain two different graphemes, <th> and <þ>, which can be used to distinguish between e.g. thin ‘thin’ and þin ‘thine’.” It should be noted that the initial dental fricative is most often pronounced voiced in grammatical words and voiceless in lexical words; the distinction might thus be interpreted as lexical rather than phonemic. On the other hand, the grammatical word ‘through’, with voiceless realisation of the fricative, is often spelled with <th>.

While <th> came to be increasingly common in the developing standard form of written English, it did not replace <þ> straight away. Thus, Benskin (1992: 86) notes that northern writers who adopted the fifteenth-century written standard had to learn a new symbol: “The habitual <y> was not to be given up entirely, but where it corresponds to [ð] or [θ], it had to be replaced by an acquired <þ>.” Northern writers adopting the standard thus had to do two things. Firstly, they had to replace <y> with <þ> when it represented a consonant. Secondly, they had to get rid of the distinction in writing between voiced and voiceless dental fricatives. As standardisation is assumed to spread especially through the language of legal documents, it will be of particular interest to compare the West Riding religious prose material and the documentary texts with regard to this development. In the two following sections, though, the present sources and data will be presented.

3. Sources and methodology

3.1. The Yorkshire material

The material for the present study consists of 43 scribal texts, included in LALME and localised in the West Riding of Yorkshire or in the City of York. It includes texts belonging to two genres, documentary texts (22 texts) and religious prose (21 texts). The texts are listed in Appendix 1 according to the codes used in the Middle English Grammar Project. [1] These codes are based on the LALME LP codes, which have been made into four-digit numbers with prefixed L, adding zeros as necessary. These codes will be used to refer to the texts throughout the discussion.
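The coding convention described above can be illustrated with a minimal sketch. The function name `meg_code` and its `suffix` parameter are assumptions made here for illustration and are not part of LALME or MEG-C; the suffix stands for the letter added when one LALME LP has been divided into several scribal texts.

```python
def meg_code(lp_number: int, suffix: str = "") -> str:
    """Turn a LALME LP number into a four-digit code with prefixed L.

    Hypothetical helper: pads the LP number with zeros to four digits,
    prefixes 'L', and appends an optional letter for LPs that have been
    split into more than one scribal text.
    """
    if not 0 < lp_number <= 9999:
        raise ValueError("LP number must fit in four digits")
    return f"L{lp_number:04d}{suffix}"


print(meg_code(4))         # L0004
print(meg_code(377, "a"))  # L0377a
print(meg_code(1352))      # L1352
```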

The localisation of the texts is shown in Figures 1 and 2.

Figure 1. Localisation of the religious prose texts

Figure 2. Localisation of the documents

All except three (L1020, L1033 and L1248) of the 22 documents refer to a specific location. The attestation of typically non-northern forms may, however, suggest a tentative localisation of L1033 in the southern part of the West Riding (see Jensen 2010: 75–76). Of the 21 religious prose texts, L0611b and L0614 were not placed on maps in LALME; however, analyses of the texts suggest a tentative localisation of L0611b in the southern extreme of the West Riding, and a localisation of L0614 in the north-western part of the area.

The texts represent a time span of a century or somewhat more: the precise scope is impossible to define as some of the datings for the texts are very imprecise. Of the 22 documentary texts, 18 are explicitly dated, and the dates range from 1371 to 1497. Only two of the religious texts, L0070 (1432) and L0116 (1357), are explicitly dated. For the other texts it is only possible to work with approximate datings, but the range of the material is not more than 150 years (see Jensen 2010: 77). Precise dates are available for 21 of the texts; these are listed in Appendix 2. In addition to these there are nine texts dated by the quarter century, five by the half-century and eight by the century; these are listed in Appendix 3.

3.2. Two types of data collection

For the purpose of this study two kinds of data collection were employed: a manual questionnaire survey of entire texts or large samples, and a computer-based analysis of 3,000-word tranches of the same texts. The manual collection was carried out, originally as part of my PhD thesis, in order to understand the dynamics of each individual text, especially with regard to scribal behaviour and changes within the text. The questionnaire items, 84 in total, consist mainly of closed categories, but also include a few open categories, among them spellings for the variable (th). As part of the Middle English Grammar Project, electronic transcriptions were produced of all of the texts in the study, in 3,000-word samples or whole if shorter. All the transcriptions are available as part of the Middle English Grammar Corpus (Stenroos et al. 2011–, version 2011.1).

Searches through the transcribed texts were carried out using the concordancing programme AntConc, version 3.2.1w. (See the sources section for a link to download this programme as freeware.) The text samples were used to retrieve actual forms to illustrate features of open-category questionnaire items; the item (th) was collected with the questionnaire as an open category, and only the spelling of the initial consonant string was noted on the work-sheet when completing the questionnaire. Here it has been useful to be able to search for actual forms in the text, both in order to cite examples and to check the lexical distribution of variants.
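The kind of search described here can be approximated by a simple keyword-in-context routine. The following is a minimal sketch only: it does not reproduce AntConc, it assumes a transcription in plain Unicode text (the MEG-C transcription conventions are not modelled), and the function name `kwic` and the sample line are invented for illustration.

```python
import re

# Word-initial <þ> or <th>. A y-shaped thorn cannot be told apart from
# ordinary <y> by pattern alone, so <y> is deliberately left out here.
INITIAL_TH = re.compile(r"^(þ|th)", re.IGNORECASE)


def kwic(text: str, width: int = 3) -> list[tuple[str, str, str]]:
    """Return (left context, keyword, right context) for each hit."""
    tokens = text.split()
    hits = []
    for i, token in enumerate(tokens):
        if INITIAL_TH.match(token):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


# Invented sample line, not taken from any of the 43 texts.
sample = "and þei sall think on all thynges"
for left, keyword, right in kwic(sample, width=2):
    print(f"{left:>12} | {keyword} | {right}")
```

Spellings with <y> for the fricative would in practice have to be identified word by word, since the same letter shape also occurs in words such as yorke.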

The two ways of collecting data have been complementary, and have allowed for the collection of a fuller set of data than would otherwise have been possible. The manual questionnaire survey has allowed for the examination of texts in their entirety, although the number of features for study could only be selective. Here, a concordance-based analysis of the transcribed sample has made it possible to recover data for features not covered by the questionnaire, if only for a limited sample. At the same time, the manual questionnaire analysis has provided the breadth and context that is crucial for making sense of a study based on an electronic corpus.

The two approaches have been useful for validating each other. It has been possible to use the electronic searches to check the frequencies arrived at by manual data collection for the equivalent stretch of text. Such checking has only been possible for a small part of the manually collected material; the remaining data will not provide 100% accurate frequencies, as manual data collection will always involve a few slips; however, there is no reason to assume that such slips would affect the overall interpretation of the linguistic patterns. At the same time, the manually collected material has been used to check the representativeness of the samples, and it has been found that the sample size is sufficient to carry out studies of features of spelling and phonology.

4. The present data

As noted in section 3.2., initial spellings for the variable (th) were collected for my PhD thesis by means of a questionnaire. For the purpose of the present study I have collected spellings for this feature also from the electronic text samples; this collection has been carried out in order to compare the two sets of data. The additional collection has, however, only been required for the religious prose texts. The documentary texts are all relatively short and were transcribed in their entirety; this means that the electronic data are necessarily identical to the questionnaire data. The source texts for the electronic samples (the religious prose texts) are listed in Appendix 1.

The two sets of data have been arranged according to word type and text type. Firstly, grammatical words and lexical words are treated separately; in northern Middle English initial dental fricatives tend to be spelled differently in grammatical words and lexical words. The grammatical words include the items ‘this’, ‘that’, ‘these’, ‘those’, ‘though’, ‘they’, ‘their’ and ‘them’, as well as the item ‘through’. The lexical words were collected as an open category, and include all attested words with initial spellings representing the variable (th). Examples are: think, thoght and thynges.
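The arrangement by word type can be sketched as a small tallying routine. This is an illustration under stated assumptions rather than the procedure actually used: it assumes that each token has already been glossed by hand with its PDE item, the names `word_type`, `initial_variant` and `tally` are hypothetical, and the forms yat and thurgh in the toy input are invented for illustration (think, thoght, thynges and þei are forms cited in this article).

```python
from collections import Counter

# Items treated as grammatical words in the study; 'through' is kept apart.
GRAMMATICAL = {"this", "that", "these", "those", "though",
               "they", "their", "them"}


def word_type(item: str) -> str:
    """Assign a PDE item to 'through', 'grammatical' or 'lexical'."""
    if item == "through":
        return "through"
    return "grammatical" if item in GRAMMATICAL else "lexical"


def initial_variant(form: str) -> str:
    """Classify the initial spelling of a form as 'th', 'þ' or 'y'."""
    lowered = form.lower()
    if lowered.startswith("th"):
        return "th"
    if lowered[:1] in ("þ", "y"):
        return lowered[0]
    raise ValueError(f"not a (th) spelling: {form!r}")


def tally(tokens):
    """Count (word type, variant) pairs for (form, PDE item) tokens."""
    return Counter((word_type(item), initial_variant(form))
                   for form, item in tokens)


# Hand-glossed toy input (assumption: glossing is done before tallying).
tokens = [("þei", "they"), ("yat", "that"), ("thurgh", "through"),
          ("think", "think"), ("thoght", "thought"), ("thynges", "things")]
counts = tally(tokens)
print(counts[("grammatical", "þ")], counts[("lexical", "th")])
```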

Secondly, the data for the religious prose texts and the documents are treated separately. Linguistic variation in Middle English texts is most commonly studied in terms of geography, and regional patterns must be expected to account for much of the variation during the Late Middle English period. At the same time, variables other than geography must be assumed to have contributed to synchronic variation. The process of standardisation must here be taken into account. The developing standard was a variety of London English that was first attested in the Signet letters of Henry V. It is then assumed to have spread mainly through the language of legal documents. The change from regional to standard usage was, however, by no means sudden, and the replacement of linguistic forms was slow and gradual. Also, northern dialects seem to have had very little in common with the developing standard, and in the North the adoption of the standard could have been very different from the southerly or Midland experience.

5. Distribution of spellings of the variable (th)

5.1. The religious prose texts

Table 1 shows the distribution of spellings in grammatical words and lexical words in the religious prose material.  The figures were collected manually with the questionnaire.

Text    Date  Grammatical words   Through            Lexical words
              <þ/y>     <th>      <þ/y>    <th>      <þ/y>    <th>
L0004   15a1  y 1385    th 4      -        th 24     y 56     th 294
L0032   14b2  þ 2905    -         -        th 108    -        th 260
L0070   1432  þ 2057    th 414    þ 356    th 42     þ 323    th 63
L0115   15a1  y 325     -         y 10     th 2      y 53     th 12
L0116   1357  y 10      th 365    -        th 19     -        th 28
L0217   15b2  þ 903     th 32     þ 1      th 25     þ 30     th 96
L0234   14    y 435     -         -        th 9      -        th 41
L0262   15a1  þ 3201    th 162    þ 2      th 133    þ 22     th 274
L0358   15ab  y 2045    -         -        th 48     -        th 226
L0406   15a   y 4370    th 1      y 110    th 18     y 183    th 113
L0454   15a   y 1173    -         y 12     th 51     -        th 152
L0473   15ab  y 2055    -         -        th 71     -        th 131
L0496a  15a2  þ 259     th 4      -        th 9      -        th 26
L0496b  15a2  þ 371     th 26     -        th 23     þ 1      th 54
L0592   14b2  y 294     th 1      y 2      th 24     -        th 34
L0597   15    þ 233     th 235    þ 2      th 7      þ 1      th 23
L0605   15    þ 86      -         -        th 3      -        th 7
L0611b  15    y 3       th 15     -        -         -        th 4
L0614   15    y 23      -         -        th 1      -        th 1
L1002   15a1  þ 156     th 2      þ 2      th 3      -        th 5
L1352   15    y 251     th 1      -        th 5      -        th 4
Total         <þ> 10171 (43%)     <þ> 363 (32%)      <þ> 377 (15%)
              <y> 12369 (52%)     <y> 134 (12%)      <y> 292 (12%)
              <th> 1262 (5%)      <th> 625 (56%)     <th> 1848 (73%)

Table 1. The religious prose material: the distribution of spellings of the variable (th) collected manually

The corpus data were collected from 3,000-word electronically transcribed tranches of the same texts. The corpus material can thus be considered a sample of the questionnaire material, allowing for a comparison of the manual data and the electronic data. The figures for both sets of data are summarised in Table 2.

                      Grammatical words  Through     Lexical words
Questionnaire  <y/þ>  22540 (95%)        497 (44%)   669 (27%)
               <th>   1262 (5%)          625 (56%)   1848 (73%)
Corpus data    <y/þ>  4318 (86%)         46 (32%)    31 (11%)
               <th>   732 (14%)          99 (68%)    262 (89%)

Table 2. The religious texts: two sets of data

As noted in section 3.2., the collection of the two sets of data has made it possible to check the representativeness of the electronic samples. As Table 2 shows, however, there are some discrepancies between the two sets of data. For all three groups of words there is a larger proportion of <th> spellings in the electronic data than in the manually collected data. The differences are relatively large, and this may have to do with the different sizes of the two types of material. Whereas the corpus samples consist of approximately 3,000 words, the source texts from which they were transcribed are generally much longer and will necessarily yield a larger set of data. Furthermore, the religious texts are of very differing length, ranging from 2 folios to more than two hundred folios, and it may be fair to assume that long texts with distinctive usage will to some extent skew the figures. A certain level of divergence was therefore to be expected. Both sets of data nevertheless show the same trend, and the more or less parallel figures are likely to give a reasonably reliable picture of the overall variation within the material. However, thus far only the overall figures from the samples have been collected, and the distribution of spellings within the individual text samples is yet to be mapped out. The following discussion will therefore be based on the manually collected questionnaire data.
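The size of the divergence between the two sets of data can be worked out directly from the counts in Table 2. The following sketch simply recomputes the proportion of <th> for each word type and the difference in percentage points; the dictionary layout and the function name `th_share` are assumptions made for illustration.

```python
# Counts copied from Table 2: (<y/þ>, <th>) per word type.
TABLE_2 = {
    "questionnaire": {"grammatical": (22540, 1262),
                      "through": (497, 625),
                      "lexical": (669, 1848)},
    "corpus": {"grammatical": (4318, 732),
               "through": (46, 99),
               "lexical": (31, 262)},
}


def th_share(y_thorn: int, th: int) -> float:
    """Percentage of <th> among all recorded spellings."""
    return 100 * th / (y_thorn + th)


for word_type in ("grammatical", "through", "lexical"):
    q = th_share(*TABLE_2["questionnaire"][word_type])
    c = th_share(*TABLE_2["corpus"][word_type])
    print(f"{word_type:<12} questionnaire {q:.0f}%  corpus {c:.0f}%  "
          f"difference {c - q:+.1f} points")
```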

For the grammatical words, the general tendency is a majority of <y> or <þ> spellings. Only 13 of the religious prose texts show <th> in initial position; three of these, L0116, L0597 and L0611b, show dominant <th>. The localisation of these texts is shown in Figure 3. The triangles represent dominant <th>, while the circles represent texts in which <th> appears as a minority form. As the Figure shows, there seems to be no clear geographical pattern.

Figure 3. Distribution of religious prose texts with initial <th> in grammatical words

For the item ‘through’, the <th> type spelling is dominant or exclusive in most of the texts in which the item is attested. The main exception is L0070, which has been localised in the southern part of the area investigated and dated to the second quarter of the fifteenth century; L0115 and L0406 also show a majority of <y> spellings for this item (see Table 1). Also for the lexical words the spelling <th> is very clearly dominant, and only three of the texts, L0070, L0115 and L0406, show a predominance of <y/þ> spellings. These three texts have been localised in different parts of the West Riding; L0070 belongs to the extreme south, L0115 to the southern half, and L0406 to the far north-west. For the religious texts, then, the distribution of spellings does not seem to show any clear geographical patterning.

Organising the religious material chronologically presents a problem. Considering the highly variable precision of the datings available, these texts cannot be placed into a single chronological sequence in any sensible way. An attempt was made to build up ‘fuzzy’ diagrams of chronology, but this turned out to be an extremely complex undertaking and did not seem to yield much in return. Evaluations of chronological developments within the religious material therefore need to be based on simple comparisons between groups of approximately dated texts.

Three of the four earliest texts, the fourteenth-century texts L0032, L0234 and L0592, show dominant or sole <y/þ> for grammatical words. Only one of the texts dated to the fourteenth century, L0116, shows dominant <th>. The latest text within the material, L0217, also shows dominant <þ>. For the item ‘through’, the <th> type spelling appears as the sole variant in all fourteenth-century texts apart from L0592, which nevertheless shows <th> as the majority form. The <th> spelling is also clearly dominant in the latest text, L0217. For the lexical words, the four earliest texts all show sole <th>. In the latest text, though, and perhaps surprisingly so, a relatively large number of spellings of the <þ> type have been attested, although the <th> type appears as the dominant form. Only three texts, L0070, L0115 and L0406, show dominant <y/þ> for all three types of words; these are dated to 15a2, 15a1 and 15a respectively. A very clear chronological pattern thus cannot be established, even though there is a rather frequent occurrence of <y/þ> in the earliest texts, at least for the grammatical words.

In the religious prose material there is thus a very clear division between word types: the spelling <th> is dominant for lexical words and the item ‘through’, whereas spellings with <y> or <þ> are dominant for grammatical words. There are, however, no clearly discernible geographical or chronological patterns.

5.2. The documents

The distribution of spellings for the variable (th) in the documentary material is shown in Table 3.

Text    Date     Grammatical words            Through             Lexical words
                 <þ/y>          <th>          <þ/y>  <th>         <þ/y>  <th>
L0133   1412     <y> 13         14            -      -            -      -
L0145   1428     <y> 138        10            -      1            -      7
L0348   1432     -              9             -      -            -      -
L0349   15       <y> 21         -             -      -            -      -
L0360   1478-79  <y> 19         -             -      -            -      -
L0363   1451     <y> 48         11            -      -            -      1
L0373   15       <y> 7          -             -      -            -      -
L0377a  1436     <y> 7          -             -      -            -      -
L0377b  1445     <y> 7          -             -      -            -      1
L0378   1431     <y> 6          -             -      -            -      1
L0415   1472-83  <y> 123        107           -      -            -      4
L0732   1451     <y> 8          -             -      -            -      -
L1001   1371     <y> 39         -             -      -            -      1
L1020   1454     <y> 3          -             -      -            -      1
L1033   1476-77  <y> 3          22            -      -            -      -
L1102a  1471     <y> 18         5             -      -            -      1
L1102b  1474     <y> 12         18            -      -            -      2
L1128   1415     <y> 10         -             -      -            -      1
L1228   1439     <y> 2          13            -      -            -      -
L1245   1497     <y> 9          3             -      -            -      -
L1248   15       <y> 8          3             -      -            -      3
L1348   1426-52  <þ> 91, <y> 1  -             -      -            -      3
Total            þ/y 455 (68%)  th 215 (32%)  -      th 1 (100%)  -      th 23 (100%)

Table 3. The documents: the distribution of spellings of the variable (th)

In the documentary material, both <y> and <th> occur in grammatical words, whereas only <th> has been attested in lexical words. The item ‘through’ has been attested in one text only, L0145, where it appears with initial <th>. Only one documentary text shows <þ>, namely L1348; it may be noted that this text stands out somewhat in the material, as it is an ecclesiastical document rather than a legal document and may thus be considered to belong to the religious rather than the legal register.

The chronological distribution of <y> and <th> in grammatical words, based on those documents that are clearly dated to a specific year or short time span, is shown in Figure 4. The columns represent percentages; the colour blue is used for spellings with <y>, whereas the colour red represents <th>. As the Figure demonstrates, the <th> spelling seems to grow somewhat more common over time.

Figure 4. Chronological distribution of <y> and <th> in grammatical words
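The percentages behind a chart of this kind can be recomputed from Table 3. The sketch below is illustrative only and rests on assumptions of my own: documents dated to a range are sorted by the first year of the range, the <þ> and <y> counts of L1348 are added together, and the names `DATED_DOCUMENTS` and `th_percentage` are hypothetical. It is not a reconstruction of how Figure 4 was actually produced.

```python
# (text, first year of dating, <y/þ> count, <th> count) from Table 3,
# dated documents only. For ranges such as 1472-83 the first year is used
# (an assumption made here purely for sorting).
DATED_DOCUMENTS = [
    ("L0133", 1412, 13, 14), ("L0145", 1428, 138, 10),
    ("L0348", 1432, 0, 9), ("L0360", 1478, 19, 0),
    ("L0363", 1451, 48, 11), ("L0377a", 1436, 7, 0),
    ("L0377b", 1445, 7, 0), ("L0378", 1431, 6, 0),
    ("L0415", 1472, 123, 107), ("L0732", 1451, 8, 0),
    ("L1001", 1371, 39, 0), ("L1020", 1454, 3, 0),
    ("L1033", 1476, 3, 22), ("L1102a", 1471, 18, 5),
    ("L1102b", 1474, 12, 18), ("L1128", 1415, 10, 0),
    ("L1228", 1439, 2, 13), ("L1245", 1497, 9, 3),
    ("L1348", 1426, 92, 0),  # <þ> 91 and <y> 1 taken together
]


def th_percentage(y_thorn: int, th: int) -> float:
    """Percentage of <th> among grammatical-word spellings in one text."""
    return 100 * th / (y_thorn + th)


# sorted() is stable, so texts sharing a year keep their listed order.
by_date = sorted(DATED_DOCUMENTS, key=lambda row: row[1])
for text, year, y_thorn, th in by_date:
    print(f"{year}  {text:<7} <th> {th_percentage(y_thorn, th):5.1f}%")
```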

The spelling unit <th> has been attested initially in grammatical words in ten documents: L0133, L0348, L0363, L0415, L1033, L1102a, L1102b, L1228, L1245 and L1248. Figure 5 shows the geographical distribution of these texts; triangles represent texts in which <th> appears as the sole or dominant spelling, whereas circles represent texts that show <th> as the minority spelling. Two of the texts, L1033, which shows dominant <th>, and L1248, in which <th> has been recorded as a minority form, have not been included in the Figure as they were not entered on maps in LALME. As the Figure shows, the texts are localised in a narrow belt stretching from the southern part of the West Riding to the north-eastern part of the area. None of the westernmost texts in the material show <th> and there thus seems to be an eastern distribution pattern.

Figure 5. Distribution of documents with initial <th> for grammatical words

On the whole, the documents show a much higher proportion of <th> spellings than the religious texts do. Spellings of the <y>/<þ> type are dominant for the grammatical words, but there is nevertheless a higher percentage of <th> than in the religious texts. In addition, only <th> spellings have been attested for the lexical words and the single occurrence of the item ‘through’.

6. Discussion and conclusion

It appears that the religious material does not show any clearly discernible geographical pattern as regards variation between <th> and <y/þ>. The texts that show dominant <th> spellings belong to different parts of the West Riding, North and South, as do the texts that show dominant <y/þ>. For the lexical words, three texts show dominant <y/þ>, two of which belong to the southern area and one of which has been localised in the northern area. This is not to say that one should dismiss regional variation altogether, and for the documents there appears to be an eastern distribution pattern, at least for texts in which <th> has been attested in grammatical words. It should, however, be noted that, unlike the religious texts, the documents form a geographically fairly coherent group. These texts are distributed in a broad belt across the central parts of the West Riding, and there are no documents associated with the north-western third of the county or with the easternmost corner. The distribution may therefore, tentatively, be assumed to reflect either demography or the accidental survival of texts (see Jensen 2010: 76).

There is no discernible correlation between spelling and date for the religious texts. For the documents, on the other hand, a diachronic pattern emerges, as there is a slight overall increase of the spelling <th> over time. The documents also show a generally higher proportion of <th>. One could here consider the possible effects of standardisation, particularly when taking into account that standardisation is assumed to have spread first through the language of legal documents. At the end of the fourteenth century, the written language was local or regional dialect almost by definition; however, by the beginning of the sixteenth century, regional forms of written English had nearly disappeared. As the spelling <th> eventually spread with standardisation, there would be an expected correlation between <th> and overall standardised usage. In the present material, however, such a correlation is highly unlikely, for several reasons. Firstly, most of the documents retain a distinction between <th> and <y/þ>, at least for the grammatical words. Secondly, and according to Benskin (1992: 88–90), there is no particular reason to think that standardisation proceeded regularly across the geography. Thirdly, it was found in the present writer’s PhD thesis that the West Riding documentary material shows little evidence of standardisation otherwise. The participle ending of the –ing type constitutes the only additional promising feature; forms of the –ing type as opposed to the –and type appear more commonly in the documents than in the prose texts. However, two forms hardly make a standard, and it might therefore be more sensible to assume that these forms simply reflect the usage of the southern West Riding area during this period, and perhaps represent a less conservative writing tradition than that of religious prose texts.

What is more, the material shows no evidence for the (re)introduction of <þ> with standardisation that was suggested by Benskin (see section 2). The only documentary text in the material in which <þ> has been attested is L1348, which is an ecclesiastical document; the texts belonging to the legal domain proper show no sign of <þ>.

One should not altogether dismiss the possibility that the distribution of spellings may be conditioned by more than one factor. On the whole, though, the material shows strong evidence for the ‘Northern system’ that distinguishes between voiced and voiceless fricatives. As noted in section 2, the voiced/voiceless distinction was allophonic in Old English, and it was not until the transition between Old and Middle English that the distinction became phonemic. As many northern Middle English writers make a distinction between voiced and voiceless fricatives, there seem to be grounds to relate the variation between <th> and <y/þ> to the phonological development that took place during this period. Within the West Riding material there is a marked tendency for the initial voiceless fricative in words such as think, through and thowgth (‘thought’) to be represented in writing by <th>, whereas the voiced fricative in words such as þei (‘they’) and þa (‘those’) is represented by <y> or <þ>.

At the same time, the material also shows a clear distinction between grammatical words and lexical words. In PDE, the variable (th) is generally pronounced voiced in grammatical words and voiceless in lexical words, and there is no reason to assume that this distinction was absent during the Middle English period. The variation between <y/þ> and <th> could therefore be interpreted as lexical rather than phonemic. The one item within the material that does not fit this interpretation is ‘through’, which is the only grammatical word for which <th> is the dominant or sole spelling. On the other hand, and as noted in section 2, ‘through’ has a voiceless rather than a voiced realisation of the fricative.
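The comparison by word class described above can be illustrated with a minimal sketch. It is not the procedure used in the study (which relied on a manual questionnaire and corpus searches); the item list, the function names and the form yei are assumptions made for illustration only, and the input is assumed to consist of pre-selected tokens with initial (th), each paired with a modern gloss.

```python
from collections import Counter

# Hypothetical list of grammatical items (modern glosses), for illustration
# only; it is not the questionnaire used in the study.
GRAMMATICAL_ITEMS = {
    "the", "they", "them", "their", "that", "this",
    "these", "those", "then", "there", "thou", "through",
}


def initial_variant(form):
    """Return the spelling variant used for initial (th), or None."""
    lowered = form.lower()
    if lowered.startswith("th"):
        return "th"
    if lowered.startswith("þ"):
        return "þ"
    if lowered.startswith("y"):
        return "y"
    return None


def tally(tokens):
    """Count spelling variants per word class.

    `tokens` is an iterable of (form, item) pairs, where `item` is the
    modern gloss. Only items with initial (th) should be passed in;
    otherwise words such as yorke would be miscounted as <y> for (th).
    """
    counts = {"grammatical": Counter(), "lexical": Counter()}
    for form, item in tokens:
        variant = initial_variant(form)
        if variant is None:
            continue
        word_class = "grammatical" if item in GRAMMATICAL_ITEMS else "lexical"
        counts[word_class][variant] += 1
    return counts


# Forms cited in the text, plus one invented y-form (yei) for illustration.
sample = [
    ("þei", "they"), ("þa", "those"), ("yei", "they"),
    ("think", "think"), ("thowgth", "thought"), ("through", "through"),
]

for word_class, variants in tally(sample).items():
    print(word_class, dict(variants))
```

In a tally of this kind, ‘through’ would surface as the grammatical item counted under <th>, which is the exception discussed above.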

Finally, there is also evidence for the merger of <y> and <þ>. It was noted in section 2 that in many northern texts <y> and <þ> are not distinguished, and both appear as a y-shaped letter. Northern orthographies thus tend to contain one grapheme fewer than southern orthographic systems. Within the West Riding material <y> and <þ> have merged in 32 out of 43 texts (20 documents and 12 religious texts). In these texts the spelling <y> corresponds to the dental fricative as well as to vocalic ‘y’ and the phoneme /j/, as for example in the words yngland ‘England’ and yorke ‘York’ respectively. This merger sometimes led to confusion, and an illustrative example can be found in L1352 as shown in Figure 6 (MS Cambridge University Library Ee.iv.19, folio 91r, line 13):

for all brethir and sisters of our modir kyrke saynt petir of thork
(lines 12–13)

Here it is quite apparent that the scribe has wrongly replaced the <y> of ‘York’, which represents /j/, with <th>, writing thork instead of ‘York’. The merger, although not present within all of the 43 Yorkshire texts, is likely to have led to a few hypercorrections and scribal mix-ups of this kind, as with the scribe of MS CUL Ee.iv.19 and his saynt petir of thork.

Figure 6. MS Cambridge University Library Ee.iv.19, folio 91r


[1] The LALME LPs were sometimes based on what strictly consists of more than one scribal text. For the purpose of the present study, as well as for MEG-C, such scribal texts have been separated and analysed individually. This is the case with LALME LPs 377, 496 and 1102.


Link to download AntConc as a freeware programme:


Benskin, M. 1977. “Local archives and Middle English dialects”. Journal of the Society of Archivists 8: 500–514.

Benskin, M. 1982. “The Letters <þ> and <y> in later Middle English, and some related matters”. Journal of the Society of Archivists 1: 13–30.

Benskin, M. 1992. “Some new perspectives on the origins of standard written English”. Dialect and Standard Language in the English, Dutch, German and Norwegian Language Areas, ed. by J. A. van Leuvensteijn & J. B. Berns, 71–105. Amsterdam: Royal Netherlands Academy of Arts and Sciences, North-Holland.

Jensen, V. 2010. Studies in the Medieval Dialect Materials of the West Riding of Yorkshire. Ph.D. dissertation, University of Stavanger.

Lass, R. 1992. “Phonology and morphology”. The Cambridge History of the English Language, vol. II, ed. by N. Blake, 23–155. Cambridge: Cambridge University Press.

McIntosh, A. 1974. “Towards an inventory of Middle English scribes”. Neuphilologische Mitteilungen 75: 602–624. Reprinted in Middle English Dialectology: Essays on Some Principles and Problems, ed. by M. Laing, 1989, 46–63. Aberdeen: Aberdeen University Press.

McIntosh, A., Samuels, M.L. & Benskin, M. 1986. A Linguistic Atlas of Late Mediaeval English. Aberdeen: Aberdeen University Press.

Scragg, D. G. 1974. A History of English Spelling. Manchester: Manchester University Press.

Stenroos, M. 2004. “Regional dialects and spelling conventions in late Middle English: searches for (th) in the LALME data”. Methods and Data in English Historical Dialectology, ed. by M. Dossena & R. Lass, 257–285. Bern: Peter Lang.

Stenroos, M. 2006. “A Middle English mess of fricative spellings: reflections on thorn, yogh and their rivals”. To Make his Englissh Sweete upon His Tonge, ed. by M. Krygier & L. Sikorska, 9–35. Peter Lang.

Stenroos, M., Mäkinen, M., Horobin, S. & Smith, J. 2011. The Middle English Grammar Corpus, version 2011.1. University of Stavanger.

Appendix 1. The texts listed according to their MEG-C codes.

The religious prose texts and source texts for the electronic samples:
L0004 London, BL Harley 1022, hand B, fols 16r-73v. Rolle's Form of Living and additional short pieces
L0032 Oxford, Bodleian Hatton 12, hand A, fols 4r-208rA. Rolle's Commentary on the Psalter and additional short pieces
L0070 Manchester University, John Rylands Library: Lib Lat. MS 179, part II, hand B, fols 32r-163v. The Mirror
L0115 London, BL Harley 1022, hand A, fols 1v, 74r-81v. Benjamin Minor and additional short pieces
L0116 York, Borthwick Institute R.I.11 (Register of Archbishop Thoresby). Hand of Thomas de Aldefeld, fols 295r-297v. Lay Folks Catechism
L0217 London, BL Harley 2250, hand B, fols 88r-108r. Memoriale Credencium
L0234 London, BL Egerton 842, hand D, fols 245r-254v (end). Homily on the Ephesians
L0262 Oxford, Bodleian Laud Misc. 286, hand E, fols 36vB-125vB. Rolle's Psalter and a homily
L0358 Oxford, University College 28, main hand, fols 1r-118r. Hilton's Scale of Perfection and other tracts
L0406 Huntington Library, San Marino, HM 148, hand B, fols 23r-203v. Rolle's Commentary on the Psalter and additional short pieces
L0454 Huntington Library, San Marino, HM 148, hand A, fols 1r-22v. The Holy Boke Gratia Dei
L0473 Oxford, Bodleian Bodley 131, fols 1-121v. English and Latin religious pieces. Mirror of the Blessed Life of Jesus Christ and additional pieces
L0496a London, BL Harley 4172, English in one hand, fols 1r.11-15v. Instructions for a Parish Priest and God's Commandments
L0496b London, BL Harley 4172, English in one hand, fols 50v-63v. Articles of faith, Ordo visitandi, marriage service
L0592 Oxford, Bodleian Hatton 12, hand B, fols 208rA-212rB (end). Rolle's Commentary on the Psalter, Commentary on the Ten Commandments, the Apostles' Creed and additional short pieces
L0597 London, Dr Williams' Library: Anc 3., fols 133v-145v. Lavenham's Treatise on the Seven Deadly Sins
L0605 Lincoln Cathedral, Chapter Library 229 (B.6.7), hand of fols 124v.9-126v.17. General Sentence of Excommunication
L0611b London, BL Cotton Nero A iii., hand B, fols 135v-137v. Carthusian Form of Confession (not entered on maps in LALME)
L0614 London, BL Royal 17 a. xvi., hand of fols 28v-29v. Vision of Saint Thomas and Hymn to the Virgin (not entered on maps in LALME)
L1002 York Minster Chapter Library XVI M 4, hand A pp. 166-177. General Sentence of Excommunication
L1352 Cambridge University Library Ee.iv.19., main hand, fols 85v-88v. General Sentence of Excommunication, etc.

The legal documents:
L0133 London, BL Harley Charter 112.F.1. Will indented
L0145 York City Archives: York Memorandum Book A7/Y 225, fols 264v.36-267v.6. Memorandum
L0348 London, BL Add. Charter 16916. Codicil in English to Latin will
L0349 Yorkshire Archaeological Society, Leeds: DD 12/II/3/9/16. Arbitration
L0360 Leeds Central Library, Archives Department: TN/HX/A13. Affidavit
L0363 Yorkshire Archaeological Society, Leeds: DD 53/III/262. Indenture
L0373 Sheffield City Libraries: Bagshaw Collection 970. Memorandum of agreement
L0377a Huddersfield Central Library: WBD/VIII/10. Indenture
L0377b Huddersfield Central Library: WBM/2. Affidavit
L0378 Huddersfield Central Library: WBD/IX/7. Indenture
L0415 Hull University Library: DDLO 21/27, 21/28, 21/30, 21/32, 21/35, 21/40 (Selby Court Rolls). Presentments of the juries at the courts of the abbot of Selby
L0732 Yorkshire Archaeological Society, Leeds: DD 57/C/W.123. Arbitration
L1001 York Minster Chapter Library: Dean and Chapter H.1 (3), Chapter Acts 1352-1426, fols 100v-101r. Ordinacio Cementariorum, 1371
L1020 London, BL Add. 40011 (A), fol. 118r (olim fol. 126r). Entry in memorandum
L1033 Halifax, Calderdale Borough Archives: SH: 1/SH/1477. Declaration
L1102 Doncaster, Bentley Library DZ FL 1/1 and DZ FL 1/48. Indentures
L1128 Beverley, Humberside County Record Office: DDCS 44/1. Indenture
L1228 North Yorkshire County Record Office, Northallerton: ZFL 59. Document
L1245 Bradford Central Library: WPB 5/18. Indenture
L1248 Leeds, Yorkshire Archaeological Society: BEA/C3/B31/74. Letter
L1348 York, Borthwick Institute: R.I.19 (Register of Abp. Kempe), one hand, fols 332v-333v. Revocation, order and Confession

Appendix 2. Explicitly dated texts

1357 L0116 (14b1; religious prose)
1371 L1001
1412 L0133
1415 L1128
1428 L0145
1431 L0378
1432 L0070 (15a2; religious prose), L0348
1436 L0377a
1439 L1228
1445 L0377b
1451 L0363, L0732
1426-52 L1348 (15a2)
1454 L1020
1471 L1102a
1474 L1102b
1476-77 L1033
1478-79 L0360
1472-83 L0415
1497 L1245

Appendix 3. Approximately dated texts

14b2: L0032, L0592
15a1: L0004, L0115, L0262, L1002
15a2: L0496a, L0496b
15b2: L0217
15a: L0406, L0454, L0349 (document)
15ab: L0358, L0473
14: L0234
15: L0597, L0605, L0611b, L0614, L1352, L0373 (document), L1248 (document)