Basic structure

(Source: Manual to the Corpus of Scottish Correspondence / 2.2 Selection of data for the CSC.)

The present CSC has been designed to provide as much information about sixteenth- and seventeenth-century correspondence as possible. The very earliest Scottish letters extant in the archives date from circa 1400. Since these are included in the ECOS-Phase 1 database compiled by Keith Williamson, an important manuscript-based source for Scottish documents dating from the fifteenth century, the CSC is restricted to post-1500 letters. Despite continued browsing through the archives, the proportion of fifteenth-century letters in the corpus will remain small. While the number of fifteenth-century autograph letters cannot be expected to increase to any great extent, there is no lack of sources in the genre of correspondence for the later periods.

The present version of the corpus also comprises some letters dating from the first decades of the eighteenth century. The eighteenth-century material is confined to these early decades primarily because later letters reflect a widening of contacts with English writers, and it seemed important to examine epistolary prose in primarily Scottish networks before extending the corpus to include letters by informants regularly commuting between Scotland and England. See the word counts for more information on the division of data by region, time period, gender and dialect area.

Selection of data

The main criteria in the selection of data for the CSC are as follows:

Only original manuscripts of letters have been included. Letters indicated in the catalogues to be later copies of the actual documents, or detected to be such by the compiler according to criteria such as type of handwriting and paper quality, have been excluded.

Priority has been given to autograph letters by a single writer, those by two or more writers being exceptions in the CSC. However, it has not always been possible to find conclusive evidence that a particular letter is indeed in the hand of the person who signed it. Comparison of hands is not always possible, since the archives have had to limit the number of documents a reader is allowed to examine simultaneously. There may be other reasons for a particular hand remaining unidentified; for example, only a single letter in that hand may survive.

Among the sixteenth-century letters in particular, there are letters written in two different hands. The most frequent pattern is that the body of the letter is in secretary hand, while the signature, sometimes the letter-closing formula and, less frequently still, the initial term of address are in a different hand, mostly resembling italic or a variety of the more rounded styles. In letters of this kind, the section in secretary hand is assumed to be non-autograph, while the signature (and the formulae) are considered autograph. The two hands are indicated by positioning the comment 'hand 1>' before the autograph sections and 'hand 2>' before the non-autograph ones.

While the chief goal has been to achieve diachronic, diatopic and diastratic representativeness, close attention has also been paid to ensuring that the proportion of letters written by and addressed to women does not remain too small (see the word counts for proportion of text by female writers by region and century and the total number of words per period and gender).
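
As an illustration of the hand comments described above for letters in two hands, the following minimal Python sketch splits a transcription into autograph and non-autograph sections. It rests on assumptions not confirmed by the manual: that the comments appear inline in plain text exactly as 'hand 1>' and 'hand 2>', and that each comment governs the text up to the next comment. The function name split_by_hand and the sample string are hypothetical and are not taken from the corpus or its tools.

```python
import re

# Assumption: the comments occur literally as "hand 1>" / "hand 2>" in the text.
HAND_MARKER = re.compile(r"hand ([12])>")

# As described above: hand 1 marks autograph text, hand 2 non-autograph text.
HAND_STATUS = {"1": "autograph", "2": "non-autograph"}


def split_by_hand(text):
    """Return (status, segment) pairs, one per non-empty marked section.

    Text that precedes the first marker is ignored.
    """
    segments = []
    matches = list(HAND_MARKER.finditer(text))
    for i, match in enumerate(matches):
        start = match.end()
        # A section runs to the next marker, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        segment = text[start:end].strip()
        if segment:
            segments.append((HAND_STATUS[match.group(1)], segment))
    return segments


# Invented sample for illustration only.
sample = "hand 2> body of the letter hand 1> closing formula and signature"
for status, segment in split_by_hand(sample):
    print(f"{status}: {segment}")
```

A split of this kind would make it possible to count or search autograph text separately from text in a secretary's hand, provided the marker format assumed here matches the files actually in use.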

For more information, see the Manual to the Corpus of Scottish Correspondence.

See also

Meurman-Solin, Anneli. 2007. The Manuscript-Based Diachronic Corpus of Scottish Correspondence. In Joan C. Beal, Karen P. Corrigan and Hermann L. Moisl (eds.) Creating and Digitizing Language Corpora, vol. 2: Diachronic Databases, 127-147. Basingstoke: Palgrave Macmillan.

Lexico-grammatical annotation

The CSC has been annotated using software created by Keith Williamson, Institute for Historical Dialectology, University of Edinburgh. However, the basic tagging system has been elaborated to improve its applicability to research on historical syntax and typology in particular.

For more information, see: