3.2.4 The structure of a tag
A tag consists of two parts: a lexel, bracketed by $ (a dollar sign) and / (a forward slash), and a grammel, following / (a forward slash).
$letter/n_LETTER
$lexel/grammel_REALISATION
The lexical item that is actually attested, in this case letter, is given in upper case after an underscore.
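The three-part shape of a tag can be made concrete with a short parsing sketch. It is only an illustration and not part of the CSC tools: the names `Tag` and `parse_tag` are hypothetical, and the pattern assumes that the grammel never contains an underscore and that the lexel never contains a forward slash.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: "$" + lexel + "/" + grammel, optionally followed by "_" + realisation.
TAG_RE = re.compile(
    r"^\$(?P<lexel>[^/]*)/(?P<grammel>[^_]*)(?:_(?P<realisation>.*))?$"
)


class Tag(NamedTuple):
    lexel: str                  # may be empty, e.g. in tags for inflectional morphemes
    grammel: str
    realisation: Optional[str]  # None when a tag is cited without an attested form


def parse_tag(text: str) -> Tag:
    """Split one tag string into lexel, grammel and realisation."""
    match = TAG_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a $lexel/grammel tag: {text!r}")
    return Tag(match["lexel"], match["grammel"], match["realisation"])
```

Under these assumptions, `parse_tag("$letter/n_LETTER")` yields the lexel `letter`, the grammel `n` and the realisation `LETTER`. Items without a dollar sign (such as `;_FRANCE`) are deliberately rejected by this sketch.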
The elaborate nature of the CSC system may give the impression that the tagging language is difficult to learn. However, the order of components in tag strings is carefully controlled, and the hierarchy between different types of information is transparent. The sequences of features in the tags are decomposable. Thus it is always possible to give an example that fits the information in the grammel completely, whereas an example that differs even minimally never produces a perfect fit. The components are presented in the following order (the set of features given as an example represents a relative structure):
| core category/categories | RN* | 'a relative pronoun, in the nominative' |
| contextual semantic comment | {y2} | 'inanimate reference, plural' |
| contextual syntactic comment | {non-ad} | 'non-adjacent position as regards antecedent/anchor' |
| contextual structural property | pr> | 'a prepositional link to another element or elements' |
| relation to another tag or tags** | <aj-sup | 'an adjective in the superlative in the antecedent' |
* Unlike other core properties, which are attached to one another by a hyphen, case is indicated by capital letters in pronoun paradigms.
** Only relations that have been recorded as relevant conditioning factors in the literature have been indicated.
In the following example, the tag of to whom indicates that the antecedent is the non-adjacent my lord kemnay:
my lord kemnay will more fully inform you, to whom I have here with written to that effect
$who/RO{+h1}{non-ad}<pr>v_WHOM
RO core properties
{+h1} a semantic comment
{non-ad} a syntactic comment
<pr and >v contextual structural properties (‘write to somebody')
Similarly, in a nominalization, the properties of the tag can be read as follows:
$before/pr-cj_BEFORE
$/P11G+C_MY
$come/vn{rc}-av>pr_COM+ING $/vn{rc}-av>pr_+ING
$to/pr<vn-av_TO
;_FRANCE
vn-av core properties
{rc} ‘reduced clause' a semantic comment on position on the cline of nouniness
pr> a contextual structural property
In addition to the fixed order of core properties, comments and relations, the symbols used for the different slots also permit the user to identify what type of information each unit in the string is intended to provide:
| core properties | - | in the grammel (cf. integration in the grammels of pronouns) |
| semantic comments | {} | in the lexel and/or the grammel |
| syntactic comments | {} | in the grammel or outside the tag (see 3.2.5.5) |
| structural comments | <> | in the grammel |
| relations between tags | <> | in the grammel |
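Because each slot has its own symbol, a grammel can be taken apart mechanically. The following sketch is one possible decomposition, not the procedure used in compiling the CSC. The function name `split_grammel` is hypothetical, and the sketch assumes that everything from the first arrow onwards belongs to the contextual links. It also discards the information about which core property a comment is attached to.

```python
import re

COMMENT_RE = re.compile(r"\{([^}]*)\}")  # comments in curly brackets
LINK_RE = re.compile(r"[<>][^<>]*")      # an arrow plus the category it points to


def split_grammel(grammel: str) -> dict:
    """Separate core properties, comments and arrow links in a grammel."""
    comments = COMMENT_RE.findall(grammel)
    bare = COMMENT_RE.sub("", grammel)   # remove comments before splitting
    arrow = re.search(r"[<>]", bare)
    head = bare[:arrow.start()] if arrow else bare
    tail = bare[arrow.start():] if arrow else ""
    return {
        "core": head.split("-") if head else [],  # hyphenated core properties
        "comments": comments,
        "links": LINK_RE.findall(tail),
    }
```

Applied to `RO{+h1}{non-ad}<pr>v`, the sketch returns the core property `RO`, the comments `+h1` and `non-ad`, and the links `<pr` and `>v`. Applied to `vn{rc}-av>pr`, it returns the core properties `vn` and `av`, the comment `rc`, and the link `>pr`.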
The aim of this careful ordering of items is to ensure that relevant data can be retrieved using searches that represent varying degrees of refinement and specificity. Thus, it is possible to search according to the core property component(s), which are always positioned first in a string (e.g. $respect/n, the lexical item respect as noun), or build up a range of more complex searches which take advantage of the availability of more detailed information in the elaborated tag (e.g. respect/n-pr{foc}>pr, the same item in the potentially grammaticalized prepositional phrase in respect of with a focusing function, or respect/n-cj{c-tf-pre}, the same item in a conjunctive phrase in respect that, where the components in curly brackets indicate that the complementizer that is explicit {c}, the communicative function of the clause introduced by in respect that is topic-forming {tf}, and the clause precedes the main clause {pre}). The symbol used for each component in the tagging language is explained in the key to tags, so the user can modify a search by selecting any number of additional properties or search for the basic component only.
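Searches of increasing specificity can be sketched as follows. The function `search` and the list `SAMPLE` are hypothetical and serve only as illustration. The three noun tags are taken from the examples above, with the dollar sign supplied and the realisation omitted, while `$respect/vi` and `$letter/n` are added purely as contrasts. A comment is matched as a whole string, so `{c-tf-pre}` is not split into its components here.

```python
import re

SAMPLE = [
    "$respect/n",
    "$respect/n-pr{foc}>pr",
    "$respect/n-cj{c-tf-pre}",
    "$respect/vi",   # hypothetical contrast, not taken from the corpus
    "$letter/n",
]


def search(tags, lexel, core, comments=()):
    """Return the tags with the given lexel whose core-property string
    begins with `core` and whose grammel carries every comment requested."""
    hits = []
    for tag in tags:
        m = re.match(r"^\$([^/{]*)(?:\{[^}]*\})*/([^_]*)", tag)
        if m is None or m.group(1) != lexel:
            continue
        grammel = m.group(2)
        found = set(re.findall(r"\{([^}]*)\}", grammel))
        bare = re.sub(r"\{[^}]*\}", "", grammel)
        core_props = re.split(r"[<>]", bare, maxsplit=1)[0].split("-")
        if core_props[:len(core)] == list(core) and set(comments) <= found:
            hits.append(tag)
    return hits
```

With this sample, `search(SAMPLE, "respect", ["n"])` retrieves all three nominal tags, whereas `search(SAMPLE, "respect", ["n", "pr"], ["foc"])` retrieves only the focusing prepositional phrase. Comparing whole core properties rather than string prefixes prevents a search for `n` from also returning categories such as `neg`.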
3.2.4.1 The lexel
The lexel functions as a representation of the more or less numerous variant forms that a particular lexical word may have in the data. Most function words are also represented by an emic word-form in the lexel (e.g. have and be as auxiliaries in compound tenses and the passive have the lexels $have/ and $be/ respectively). However, the lexel slot in tags for inflectional morphemes is left empty ($/vpt instead of $-ed/vpt for past tense). Similarly, only the grammel is provided for personal, reflexive and demonstrative pronouns, as well as definite and indefinite articles (e.g. $/P11N, $/P12N, $/P13NM, $/P13NF, $/P13NI for personal pronouns in the nominative singular). Another important category that is tagged with the grammel only is that of zero-realisations of relatives (e.g. $/0RN{y1}, a zero relative as subject with a singular inanimate noun as antecedent).
While the lexel slot is left empty with inflectional morphemes (e.g. $/vpp_+ED), with derivational morphemes the affix is given (e.g. $-ness/xs-n_+NESS). The user should note that in the present system the element -ly in open-class adverbs has been treated as derivational (e.g. $-ly/xs-av_+LY), including in cases in which there is variation between a variant with -ly and one that has been tagged as a so-called zero-realisation of the derivational morpheme (e.g. $right/av_RIGHT+LY $-ly/xs-av_+LY alternating with $right/av_RIGHT $-ly/xs-av_+0).
The choice of word-form to represent a particular linguistic item in the lexel is governed by the following principles: to permit the flexible use of the CSC for comparative research on varieties of English, a Present-day English equivalent is given as the lexel, except in cases where a feature has only been recorded in Scots or has been identified as distinctively Scots (cf. the comment 'chiefly Scottish' in the OED). The lexel usually consists of one word; compounds and collocates, for instance, are split into their component parts at the level of basic tagging (for information on tagging relations between such items, see the section on compounding). The base form is used in derivatives in which lexical morphemes are transparent. Evidence of transparency can be observed in English coinages which represent a productive pattern of word-formation, either as regards a particular derivational morpheme or on the basis of analogy. For example, the adjective glorious is represented by the lexel glory, and the verbal noun misunderstanding by understand, whereas morphologically opaque items or loanwords containing elements which have not become productive in English remain unanalysed. Lexicalized items such as notwithstanding or loanwords such as molestation in legalese are left unanalysed, except in the former case as regards variation between the morpheme -ing and -and in Older Scots. The online OED (Oxford English Dictionary http://dictionary.oed.com/) and the DSL (Dictionary of the Scots Language http://www.dsl.ac.uk/) have been consulted for information on the lexico-grammatical history of a given item. In unclear cases, preferred practice is to tag components as separate units. For example, therefore with text-structuring function signalling a causal relation is a single unit in the lexel, whereas therefor(e), 'for it' is represented by two lexels, there and for.
Information to assist semantic disambiguation is provided in comments in curly brackets immediately following the lexel. The principle in the present version of the corpus is to treat one of the meanings in a polarized system of two as a default property, and to add the semantic comment to the other (e.g. $since/cj for since with temporal meaning, as compared with $since{cause}/cj for since with causal meaning; $yet/av for yet with temporal meaning, as compared with $yet{conc}/av for yet with concessive meaning). In ambiguous cases, the alternative readings are made explicit in the comment ($since{time&cause}/cj). In non-dichotomous systems, each occurrence is tagged with a semantic comment (e.g. in the case of polysemous as and while). The rationale for defining a particular property as a default value is that this makes the tagging process more economical. The user hopefully also benefits from comments such as {allow} attached to the verb suffer in the sense of permission, {purpose} attached to the conjunction that to distinguish it from its other adverbial uses and from the complementizer that, and {same} attached to the numeral one ($1{same}/qc-n) to indicate this particular sense. Some high-frequency prepositions are also given comments; for example, the comment {until} is added to to as an equivalent of until ($to{until}/pr). Similarly, {place} attached to the preposition of refers to its use with the meaning 'from' in place adjuncts. The present tagger's choice of features to be commented on is of course subjective, and the user may find it useful to examine the system in more detail, in order to be able to revise or refine it for a particular study. For an exercise of this kind, the creation of a concordance of lexels with attached comments is recommended. In general, the key to comments will provide sufficient information.
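A concordance of lexels with attached comments, as recommended above, might be compiled along the following lines. This is a minimal sketch: the function name `lexel_comment_concordance` is hypothetical, the input is assumed to be a list of tag strings, and only comments placed before the forward slash are indexed.

```python
import re
from collections import defaultdict

# Lexel, one or more comments attached to it, grammel, and (optionally) the form.
LEXEL_RE = re.compile(r"^\$([^/{]*)((?:\{[^}]*\})+)/([^_\s]*)_?(\S*)")


def lexel_comment_concordance(tags):
    """Map (lexel, comment) to a list of (position, grammel, form)."""
    index = defaultdict(list)
    for position, tag in enumerate(tags):
        m = LEXEL_RE.match(tag)
        if m is None:
            continue  # no comment attached to the lexel
        lexel, comments, grammel, form = m.groups()
        for comment in re.findall(r"\{([^}]*)\}", comments):
            index[(lexel, comment)].append((position, grammel, form))
    return dict(index)
```

For a tag such as `$pray{cause}{lat}/vps11<P+_PRAY`, the sketch produces two entries, one under `("pray", "cause")` and one under `("pray", "lat")`. A tag such as `$death/n{rc}-av_DAITH` is skipped, since its comment belongs to the grammel.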
The lexel is also that unit in the tag in which information about discourse functions can be provided:
$/T_*THE
$lord/n{tl}>pr_LAIRDE
$of/pr<n_OF
;_* BOYNE
{\}
$have/vpt13<cnp+_HAD
$appoint{cause}/vpp{ptp}_APOINT+ED $/vpp{ptp}_+ED
$/P11O_ME
$to/im+C_TO
$meet/vi-av_MEITT
$/P13OM_HIM
In the complementation of the causative act of appointing, the infinitive is tagged vi-av, since it varies with finite adverbial clause alternatives attested in the data.
The comment {rep} attached to the lexel indicates that an item is used as a proform, i.e. to avoid repeating a constituent in the preceding context, and arrows are used to permit a search for the word which has been substituted.
$/P02G_YOUR
$happy/aj_HAPIE
$delivery/n{rc}>pr_DELIUYUERIE
$of/pr<n_OF
$/A+C_A
$young/aj_YOUNG
'_*CHARLES $/n>av_*CHARLES
$for/cj{ts}{tf}_FOR
{\}
$so{rep}/av<n_SO
$/P02G_YOUR
$father/n_FATHER
$call/vps13<n+_CALL+ES $/vps13<n+_+ES
$/P13OM_HIM
$/A+C_A
$knave/n_KNAWE
$that/RN{+h1}_THAT
$have/vpt13<R+_HED
$wrong/vpp{ptp}>v_WRONG+ED $/vpp{ptp}>v_+ED
$/P11O_ME
$so/av>cj_SO
{\}
$much/av_MUCHE
$as{comp}/cj<av_AS
$/P13NM_HE
$have/vps13<P+_HES
$do{rep}/vpp{psp}<v_DOWNE
So and do are the most frequent items in the data to receive the comment {rep}.
Latinate constructions such as object + infinitive, present participle or past participle in the complementation of particular verb categories can be found by searching for the comment {lat} attached to the lexel:
$/P11N_j
$pray{cause}{lat}/vps11<P+_PRAY $/vps11<P+_0
{in margin>}
{zero that&Oinf}
$/P02O_YOU
$let{lat}/vsjps02<P+{nom}_LETT $/vsjps02<P+{nom}_0
$/P11O_ME
{zero im}
$know/vi{-im}-av_KNOE
$what/pn_WHAT
$be/vps13<pn+_IS
$say/vpp{pass}>pr-cj_SAIED
$of/pr-cj<v_OF
$/P21G_OUR
$stay/n{rc}_STAYE
Some of these Latinate constructions are ambiguous and allow two different readings. These have been specified in the comment: the option 'zero that' suggests that the complementation is a nominal that-clause with the so-called that-deletion, while the second option 'Oinf' interprets the construction as object + infinitive (Latin accusativus cum infinitivo). Since one of the two options will have to be chosen in order to provide a tag for the verb, the following practice has been applied: with verbs attested in object + bare infinitive constructions (such as let) the structure is read as an Oinf (let me know), whereas with other verbs (such as pray) the option {zero that} followed by a nominal clause with the verb in the subjunctive (pray that you let me) is selected. This decision is based purely on pragmatic concerns; it aims to facilitate data retrieval and does not draw on thorough research on the subject.
3.2.4.2 The grammel
The order in which properties are specified in the grammel is fixed, and the boundary between the lexel and the grammel is marked by a forward slash ($sister/n). The first item immediately following the slash states the core property of the item in terms of word class or part of speech. A set of co-ordinates on a cline is often required to provide relevant information about a particular feature in a particular context; the components are then listed in a hyphenated string of properties. The order of components in the grammel is carefully controlled, and the hierarchy between different types of information is transparent. The following two uses of because will illustrate how the subordinator use of this item can be distinguished from the prepositional use by perusal of the grammel:
because (that)
$by/pr-cj_BE+
$cause/n{rc}-av-cj_+CAUSE
$that/cj<_THAT
because of
$by/pr-cj_BE+
$cause/n{rc}-av-pr>pr_+CAUSE
$of/pr<n-av-pr_OF
These co-ordinates, which situate the items on a cline, are a useful method of reflecting the inherent fuzziness or polyfunctionality of features, and of providing relevant information in certain types of circumstances. In order to trace developments over a long time-span, especially grammaticalization and lexicalization processes, it is necessary to indicate the various stages, beginning with the origin, listing properties perceived in the analysis of pragmatic inferences, identifying examples which provide evidence of a process of grammaticalization, and stating the grammaticalized use and any further developments.
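The distinction between the two uses of because can be read off the final unit of the core-property string. The sketch below assumes, on the basis of these two examples only, that the last unit states the use in context. The function name `contextual_category` is hypothetical.

```python
import re


def contextual_category(grammel: str) -> str:
    """Return the last unit of the hyphenated core-property string."""
    bare = re.sub(r"\{[^}]*\}", "", grammel)          # drop comments
    core = re.split(r"[<>]", bare, maxsplit=1)[0]     # drop arrow links
    return core.split("-")[-1]
```

For the grammel `n{rc}-av-cj` of cause in because (that), the function returns `cj`; for `n{rc}-av-pr>pr` in because of, it returns `pr`.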
Supplementive adjective clauses (Quirk & al. 1985: §7.27-29) may be realised by a single unit, as in the following example:
$/P23N_THEY
$will/vm_WILL
$come/vi>pr_COME
$safe/aj-av_SAFE
$to/pr+C<vi_TO
$/P02G_YOur
$lord/nG{ho}_LO+P+^S $-ship/xs-nG{ho}_+P+^S $/Gn{ho}_+^S
{\}
$hand/npl-av_HAND+S $/pln-av_+S
The adjective safe is tagged aj-av, the two core properties defining the variational space in which this particular use is analysed. (Lops in this extract is a contracted form of the genitive lordship's.)
A three-unit cline occurs in the following example:
$which/RN{sent}_WHICH
$make/vpt_MADE
$/P11O_ME
$/T>av_THE
$more/av<T_MORE
$forward/aj-cpv>vi_FORWARDE
$to/im+C_TO
$come/vi<aj-cpv_COME
{\}
$prepare/vpp{pass}-aj-av>vi_PREPAR+ED $/vpp{pass}-aj-av>vi_+ED
$to/im+C_TO
$do/vi<vpp-aj-av_DOE
$some/pn-aj>n-pn_SOME+
$thing/n-pn<pn-aj_+THING
The grammel listing the properties of prepared consists of information about the verb form (vpp), the voice {pass}, and the adjectival use of a participle (vpp{pass}-aj), which shares the variational space with adverbials (-av) and the type of complementation (>vi). The tag for accompanied in the following example is the same except for the indication of a prepositional complement.
$/T_*THE
$lord/n{tl}>pr_LARD
$of/pr<n_OFF
;_*ADIE
$meet/vpt_MET
$/P13OM_HIM
$accompany/vpp{pass}-aj-av>pr_ACCOmPAN+EIT $/vpp{pass}-aj-av>pr_+EIT
$with/pr<vpp-aj-av_W^T
$/P11G+C_*MY
{\}
$lord/n{tl}_LORD
‘_*OGILVAY
In principle, the maximum number of units in a core-property string is four. Four-unit (or the very infrequent longer) sequences chiefly occur in grammels providing information about lexical morphology. In the following example, the core properties of the prefix (xp) and suffix (xs) of an adverb are listed:
$deserve/vpp{pass}-aj-av_VN+DESERU+ED+LIE $un-/xp-vpp{pass}-aj-av_VN+ $/vpp{pass}-aj-av_+ED+ $-ly/xs-vpp{pass}-aj-av_+LIE
The order of information is from the category of the lexical morpheme (xp) to the properties of the base, which in this case is a participial adjective with passive function (vpp{pass}-aj), the category of this combination in this particular context being given as the last tag (av). As mentioned earlier, in the present system the morpheme indicating open-class adverbs is treated as a derivational morpheme (xs).
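A line of this kind, in which the tag of the whole word is followed by the tags of its morphemes, can be unpacked with a sketch such as the following. The function names are hypothetical. The sketch assumes that tags on one line are separated by spaces, that `xp` and `xs` open the grammels of prefixes and suffixes, and that an empty lexel marks an inflectional morpheme.

```python
def split_tag(tag: str):
    """Return (lexel, grammel, form) for a tag beginning with '$'."""
    lexel, _, rest = tag[1:].partition("/")
    grammel, _, form = rest.partition("_")
    return lexel, grammel, form


def morphemes(line: str):
    """Classify the morpheme tags that follow the word tag on a line."""
    _word, *parts = line.split()
    result = []
    for part in parts:
        lexel, grammel, form = split_tag(part)
        if grammel.startswith("xp"):
            kind = "prefix"
        elif grammel.startswith("xs"):
            kind = "suffix"
        elif lexel == "":
            kind = "inflection"
        else:
            kind = "other"
        result.append((kind, lexel, form))
    return result
```

Applied to the line for undeservedly, the sketch lists the prefix `un-` (VN+), the inflection with an empty lexel (+ED+), and the suffix `-ly` (+LIE), in that order.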
Longer core-property strings also occur in compounds (-k):
$by/pr_BY
$/P13GM_HIS
$majesty/nG{ho}_*MAJESTI+ES $/Gn{ho}_+ES
$high/aj{tl}_HIGH
$commission/n-av_*COMMISSION+ER $-er/xs-n-av_+ER
The constituents of this compound are related to one another by arrows, and -av{pass} indicates that the compound realises the agent in a passive clause.
3.2.4.3 Comments within tags
Each core property may have a comment, but comments attached to the one positioned first are significantly more frequent. This can be illustrated using the example of grammels attached to relatives, in which animate, human or non-human, or inanimate reference, as well as the number of the antecedent, are indicated in comments in curly brackets. The following summary also illustrates systems in which the tagged properties specified in the comments are mutually exclusive, there being no default properties:
- The core properties RN nominative, RO oblique, RC complement, RX adverbial and Raj attributive cover the whole variational field.
- Number of the antecedent: singular, plural, other ('other' is tagged {y0}, {+h0} or {-h0}).
- The tags {y1}, {y2}, {y0}, {+h1}, {+h2}, {+h0}, {-h1}, {-h2}, {-h0} and {sent} are mutually exclusive. The comment {sent} is used when the anchor is a nominalization, as illustrated by the example below. (As discussed in Meurman-Solin (2007), the concept of anchor is useful for the analysis of anaphoric reference in historical texts, a syntactic constituent functioning as an antecedent being distinguished from the anchor of a reference as identified and defined by semantic criteria (see Huddleston and Pullum 2002: 1353-4).)
- Definiteness of the antecedent: generic/indefinite, other. The tags {y0}, {+h0} and {-h0} are used for generic or indefinite and {y1}, {y2}, {+h1}, {+h2}, {-h1}, {-h2} for definite reference.
- Animacy of the antecedent: animate human, animate other, inanimate. Any property in the set of {y1}, {y2}, {y0}, {+h1}, {+h2}, {+h0}, {-h1}, {-h2} and {-h0} automatically excludes the option {sent} and vice versa.
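The system summarised in this list can be expressed as a small decoding routine. It is a sketch under stated assumptions: the function name `decode_antecedent` is hypothetical, and the reading of `-h` as 'animate other' is inferred from the list rather than stated for that code explicitly. The routine also enforces mutual exclusivity by rejecting a grammel that carries more than one comment from the set.

```python
import re

ANIMACY = {"y": "inanimate", "+h": "animate human", "-h": "animate other"}
NUMBER = {"1": "singular", "2": "plural", "0": "other"}
CODE_RE = re.compile(r"(?:y|[+-]h)[012]|sent")


def decode_antecedent(grammel: str) -> dict:
    """Decode the antecedent comment in the grammel of a relative."""
    codes = [c for c in re.findall(r"\{([^}]*)\}", grammel)
             if CODE_RE.fullmatch(c)]
    if len(codes) != 1:
        raise ValueError(f"expected exactly one antecedent comment: {grammel!r}")
    code = codes[0]
    if code == "sent":
        return {"sentential": True}
    return {
        "sentential": False,
        "animacy": ANIMACY[code[:-1]],
        "number": NUMBER[code[-1]],
        "definite": code[-1] != "0",   # 0 marks generic or indefinite reference
    }
```

For example, `decode_antecedent("RO{y2}{non-ad}")` reports an inanimate, plural, definite antecedent. The comment `{non-ad}` is ignored, since it does not belong to the set.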
There may be more than one comment following the core property. A comment on non-adjacent position or discontinuity is positioned after a semantic comment. For example, in the relative system the string /RO{y2}{non-ad} indicates a relative pronoun in the oblique case which refers to a non-adjacently positioned inanimate plural noun antecedent. The comment {disc} is used to tag intervening relatives in discontinuous nominal structures:
$/T_THE
$great/aj_GREAT
$care/n{rc}>vi_CAIR
{\}
{zero rel}
$/0RO{sent}{disc}_0
$/P02N_YOU
$have{n}/vps02<P+_HAUE
$to/im+C_TO
$see{lat}/vi{non-ad}<n_SIE
$thing/npl_THING+S $/pln_+S
$settle/vpp{pass}-av_SETL+ED $/vpp{pass}-av_+ED
$betwix/pr_BETUIX
{\}
$/P11G+C_MY
$nephew/n-av_*NEPHEWE
{appositive}
$/T_THE
$earl/n{tl}{app}-av>pr_*EARLL
$of/pr<n-av_OF
;_*ATHOLL
{\}
$&/cj_AND
$/P11O-av_ME
In this example, the relative structure introduced by a zero link occurs between the noun care and an infinitive clause which constitutes a discontinuous nominal structure. The non-adjacent position of the infinitive clause is also explicitly indicated in the grammel of the verb see. Comments providing information about text-structuring function ({ts}), topic-forming potential ({tf}), or formulaic use ({f}) are positioned in this order at the end of a string; for example, the order of the three types of comments in the grammel of a relative construction would be the following:
| {+h1} | semantic properties of the antecedent |
| {non-ad} | position of the antecedent |
| {f} | discoursal properties |
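The fixed order of comments lends itself to a simple consistency check. The following sketch covers only the comment types named in this section (antecedent codes, `{non-ad}` and `{disc}`, and `{ts}`, `{tf}`, `{f}`) and ignores all other comments. The function name `comments_in_order` is hypothetical.

```python
import re

ANTECEDENT_RE = re.compile(r"(?:y|[+-]h)[012]|sent")
POSITION = {"non-ad", "disc"}
DISCOURSE = {"ts": 2, "tf": 3, "f": 4}   # positioned in this order at the end


def rank(comment):
    """Return the slot of a comment, or None if it is not covered here."""
    if ANTECEDENT_RE.fullmatch(comment):
        return 0
    if comment in POSITION:
        return 1
    return DISCOURSE.get(comment)


def comments_in_order(grammel: str) -> bool:
    """Check that the covered comments follow the order described above."""
    ranks = [rank(c) for c in re.findall(r"\{([^}]*)\}", grammel)]
    ranks = [r for r in ranks if r is not None]
    return ranks == sorted(ranks)
```

A string such as `RO{+h1}{non-ad}{f}` passes the check, whereas a string in which `{f}` precedes `{+h1}` does not.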
In terms of the positioning of semantic comments, core-property strings call for a specific practice to achieve as high a degree of transparency as possible. The semantic comment is attached to the item it relates to instead of being positioned immediately after a multi-unit string of categories. The tag /RO{sent}-cj, for example, indicates a sentential relative functioning as a connective at the discoursal level (for more information on levels of analysis, see Meurman-Solin 2007).
In addition to the comments in curly brackets discussed above, nominalizations can be identified by the comment {rc}, for 'reduced clause', which is attached to the core property. The modelling of variation between finite clauses and their reduced variants, which represent various degrees of nominalization, is an essential feature of the theoretical approach adopted here. How transparent a verbal process is in a nominalization is not a decisive factor in the identification of the so-called reduced variants:
$to{until}/pr+C-cj_TO
$death/n{rc}-av_DAITH
This prepositional phrase occurs as part of an epistolary formula frequently used at the end of a letter as an avowal of loyalty; a similar meaning is expressed elsewhere using adverbial clauses such as till/until/unto the time (that) I die. In order to enable the user to search for temporal expressions of both types, the core property string pr-cj is used to tag the preposition (+C referring to the initial consonant of the following word, relevant in the study of variation between to and till), and the noun death in an adverbial phrase is tagged n{rc}-av. While the tag pr-cj only occurs in prepositional structures with a nominalization as head or complement, n{rc} is often further described by links to contextual properties:
$/P02G_YOUR
$happy/aj_HAPIE
$delivery/n{rc}>pr_DELIUYUERIE
$of/pr<n_OF
$/A+C_A
$young/aj_YOUNG
'_*CHARLES
The writer refers to the addressee having given birth to a son, the deverbal noun derivative delivery being used instead of a clausal variant such as that you have delivered or you having delivered.
In the following example, the word silence has been tagged as a nominalization.
$after/pr-cj_AFTER
$2/qc_2
{\}
$year/nGpl_YEAR+S $/Gpln_+S
$silence/n{rc}-av_SILENCE
From the perspective of lexical morphology, loanwords such as silence have not been considered English coinages and are not tagged as deadjectival noun derivatives (cf. $good/n_GOOD+NESS and $-ness/xs-n_+NESS). However, since the tagging system here also takes into account variation resulting from different ways of processing information, the tag n{rc} permits the user to position this occurrence on a cline, contrasting reduced and unreduced relational processes (after you had been silent for two years).
No distinction is made between gerunds, verbal nouns and deverbal nouns (see Quirk & al. 1985: §17.52-54): the grammel vn{rc} is used with all of them. (However, in contrast to the practice adopted in Quirk & al. 1985, in which the term '-ing participle' is used to refer to both present participles and gerunds, the tags in the present system make a distinction between the two, using the tag vpsp for the former and vn for the latter.)
$at/pr-cj_AT
$/P11G+C_MY
$come/vn{rc}-av>pr>pr_CUm+ING $/vn{rc}-av>pr>pr_+ING
$from/pr<vn-av_FROM
;_jYRLAND
$to/pr+V<vn-av_TO
;_Edinburgh
This nominalization alternates with a number of other potential realisations, such as when I came from Ireland to Edinburgh.
3.2.4.4 Contextual properties
Information about the structural properties of the context is positioned after the core-property categories and comments directly related to them. Since the main principle is to avoid suggesting a particular syntactic reading, this information only records relations between the tagged item and other items in the context, in order to enable refined searches. In other words, the user can restrict a search by excluding occurrences which do not fit a particular set of contextual features, for instance. Relations considered relevant are tagged with pairs of arrows:
$/T_THE
$occasion/n>pr-cj_OCCASION
$of/pr-cj<n_OF
$/P11G+C_MY
$write/vn{rc}>pr_VRYT+TIN $/vn{rc}>pr_+TIN
$unto/pr+C<vn_VNTO
{\}
$/P02O_*ZOU
$at/pr_ATT
$/Dis_THIS
This complex noun phrase with the noun occasion as its head includes a prepositional phrase as a postmodifier. This relation is indicated with n>pr-cj and pr-cj<n. Since the preposition introduces a nominalization of the act of writing to a recipient at a particular point in time, the relation between the verb and the prepositional object (the verb representing the prepositional ditransitive complementation type: to write something to somebody) is tagged with vn{rc}>pr and pr+C<vn; comments are omitted in the addresses of items forming a pair (thus <vn instead of <vn{rc}). The expression of time is an adverbial adjunct in the clause of which this postmodifier is a nominalization; since time adjuncts of this kind have no potential for reanalysis as regards the complementation of a verb like write, there is no link from the preposition at to vn{rc}. However, the grammel /n of time indicates the modifier status of the phrase at this time and distinguishes it from nominal heads in prepositional phrases as clause-level adverbials, the latter being given the grammel n-av.
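The pairing of arrows can be traced with a sketch of the following kind. It is restricted to forward links (`>`) that are answered by a back-link (`<`) in a later tag, and it makes two assumptions beyond the text: that the address in a link is the core-property string of the partner without comments, and that a final marker such as `+C` is not part of that address. The function names are hypothetical.

```python
import re


def core_and_links(grammel: str):
    """Return the core-property string (without comments) and the links."""
    bare = re.sub(r"\{[^}]*\}", "", grammel)
    arrow = re.search(r"[<>]", bare)
    core = bare[:arrow.start()] if arrow else bare
    links = re.findall(r"[<>][^<>]+", bare[arrow.start():]) if arrow else []
    core = re.sub(r"\+[A-Z]$", "", core)   # assumed: drop +C and similar markers
    return core, links


def pair_links(grammels):
    """Pair each forward link with the first later tag that answers it."""
    parsed = [core_and_links(g) for g in grammels]
    pairs = []
    for i, (core, links) in enumerate(parsed):
        for link in links:
            if not link.startswith(">"):
                continue
            target = link[1:]
            for j in range(i + 1, len(parsed)):
                partner_core, partner_links = parsed[j]
                if partner_core == target and "<" + core in partner_links:
                    pairs.append((i, j))
                    break
    return pairs
```

For the grammels `T`, `n>pr-cj`, `pr-cj<n`, `P11G+C`, `vn{rc}>pr` and `pr+C<vn` of the example above, the sketch pairs occasion with of and write with unto. Links that point backwards from a later item to an earlier one are outside its scope.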
The verb complementation types of multi-word verbs (phrasal verbs, prepositional verbs and phrasal-prepositional verbs) can be examined by drawing on this system of tags which explicitly indicate contextual relations:
$&/cj_AND
$/neg>vi_NOTT
$to/im+C_TO
{-}
{\}
$make/vi>pr_MAK
$any/pn-aj_ANY
$use/n{rc}>pr_VISs
$of/pr<vi_OF
$/P13OI_ITT
$for/pr<n_FOR
$/P11X_MY-SELF $-self/xs-P_-SELF
The prepositional ditransitive use of the verb make (make use of something) is tagged with vi>pr and pr<vi, the relation between the head (use) and modifier (for myself) within the nominalization being indicated by n{rc}>pr and pr<n.
When the prepositional object is a nominalization, the preposition attached to the verb has two core properties, namely 'pr-cj'. In the example They suspended him from being the chairman, the prepositional verb is tagged as follows:
$suspend/vpt>pr-cj
$him/P13OM
$from/pr-cj<v
$be{n}/vn{rc}
This practice can also be illustrated using the tagging of relative structures:
$/T_THE
$principal/aj_PRINCSPAL
$sum/n_SOUME
$for/pr>R>v_FOR
$/T>R_THE
$which/RO{y1}<T<pr>v_QUHILK
$/P11N_j
$stand/vps11<P+<pr<R_STAND $/vps11<P+<pr<R_0
{\}
$caution/n_CATION
The items stand caution for, irrespective of how one defines idioms or idiomaticity, can be considered a co-occurrence pattern, or a collocate. Therefore, the tags provide links which users of the database may find relevant in designing a search. The features subject to elaboration have been selected to restrict the focus to the structural features of the verb. The element <pr attached to the grammel of the verb phrase /vps11<P+ refers back to >v in the grammel of for, while <R indicates that the complement of for is a relative, its pair, >v, being in the grammel of which. The rest of the links provide information about the relative structure, stating that the variant is the which and that the relative is the complement of a preposition. In my view, both perspectives are necessary, the analysis of the verb complementation type providing relevant information for the analysis of the relative construction.
If the preposition is stranded, it is tagged as follows:
$/Dat>R_*THAT
{\}
$letter/n_LETTER
{zero rel}
$/0RO{y1}<Dat>v>pr>_0
$/P11N_j
$write/vpt>pr<R_WROTTE
$of/pr<<v<R_OF
The prepositional monotransitive use of the verb write calls for the creation of a link between the verb and the preposition: >pr and <v. The complement of the preposition is a zero relative; its grammel indicates that the preposition attached to the verb (>v) is stranded (>pr>). A further link permits the study of questions such as which determiners or attributes in the antecedent favour the use of a zero relative; in this example arrows are positioned in the grammels of the demonstrative (Dat>R) and the zero relative (0RO{y1}<Dat). As discussed in Section 3.2.3, zero-realisations are full members in variationist typologies, and therefore links are also created between explicitly expressed items and those that remain implicit, irrespective of whether the latter have separate tags, as in the above example, or are marked with a comment:
$/P11N_j
$will/vm_WILL
$still/av_STILL
$entreat{cause}{lat}/vi_ENTREATT
$/P02G_YOWR
{.}
$lordship/n{ho}_*L
{.}
$to/im+C_TO
$recommend/vi-av>pr_RECOM\MEND
$/P11O_ME
$to/pr+H<vi-av_TO
$/P13GM_HIS
$majesty/n{ho}_*MA
{.}
$in/pr_IN
$/Dat_THATT
$business/n-av_BISSINES
{ins}
$how/av>cj_HOW
$soon/av-cj_SCHONE
{zero cj<av}
$/P02N_YOW
$can/vm_CAN
{zero vi}
Since the variant how soon, here without an explicit conjunction, shares membership in the same variationist typology as the connectives so/as/how soon as/that, a link has been created between av>cj and the zero-realised cj<av. Personally, I find a separate tag with an initial zero more appropriate for my own research, but in the present version of the CSC I have decided to use the two complementary systems.
3.2.4.5 Tag-external comments
Comments may also be positioned outside the tag. The addition of comments of this kind is motivated by the fact that some features cannot be found simply by searching for a basic property or a particular lexel, or even by taking advantage of the elaborated information in the grammel. A case in point is the appositive structure, which consists of juxtaposed nominal structures, often representing different degrees of structural complexity. In the system applied to the CSC data, appositive structures can be found by searching for either the independent comment {appositive} positioned between the two units, or the component {app} in the grammel of the second unit, the conjunction in the case of nominal clauses:
$/S_THER
$be{n}/vps13<S+_IS
$/A+C_A
$word/n<S_WORD
{appositive}
$that/cj{app}_THAT
$/P13NM_HE
$be{n}/vps13<P+>vi_IS
$to/im+C_TO
{\}
$go/vi<v>av_GOE
$away/av<vi_AWAY
Since the second unit in this appositive structure is a nominal that-clause, the comment {app} is added to the grammel of that. The property <S attached to the notional subject refers to the grammatical subject there in the preceding context. Instead of presenting a conclusive analysis, this practice will allow the creation of inventories for studying the history of so-called existential sentences.
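Both routes to appositive structures can be combined in one pass over the tagged text. The sketch assumes that the text is available as a list of lines, each holding either tags or a tag-external comment. The function name `find_appositives` is hypothetical.

```python
def find_appositives(lines):
    """Return (line index, source) for each marker of an appositive structure."""
    hits = []
    for i, line in enumerate(lines):
        text = line.strip()
        if text == "{appositive}":
            hits.append((i, "external"))       # independent comment
        elif text.startswith("$"):
            first_tag = text.split()[0]
            grammel = first_tag.partition("/")[2].partition("_")[0]
            if "{app}" in grammel:
                hits.append((i, "grammel"))    # component in the grammel
    return hits
```

In the example above, the sketch reports the independent comment between word and that, and then the `{app}` component in the grammel of that.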
A number of these comments describe clause- and sentence-level features related to information processing, the most important of these being inversion and fronting (information on extraposition is included in the grammel, as illustrated by the example above). A single arrow attached to a comment in curly brackets refers to a structural feature of the following chunk of text:
$neither/neg-cj{ts}>neg-cj_NETHER
{inversion>}
$do/vps23>P+{neg}>vi_DOE $/vps23>P+{neg}>vi_0
$/P23N_THEY
$nor/neg-cj<neg-cj_NOR
{inversion>}
$can/vm_CAN
$/P23N_THEY
{\}
$know/vi<v_KNOW
$/P23G_THER
$own/aj_OWNE
$strength/n{rc}_STRENTHE
It may seem to be possible to search for this construction using the grammel of the verb phrase, since >P+ indicates that a pronoun subject follows the predicate verb. However, since word order is not indicated in the present praxis of tagging modal verbs, for example, a comprehensive inventory of inverted word order can only be created by searching for the independent comment. If inversion indicates subordination, this is also pointed out in the comment:
{inversion indicating subordination>}
{cond}
$be/vsjps23>P+{cond}_WERE
$/P23N_THEY
{\}
$pay/vpp{pass}>av_PAY+D $/vpp{pass}>av_+D
$of/av<v_OFF
The clause functions as an adverbial clause of condition.
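An inventory of this kind can be drawn up by collecting every independent comment that ends in an arrow, together with the first tagged line that follows it. The sketch below again assumes a list of lines as input. The function name `forward_comments` is hypothetical.

```python
import re

ARROW_COMMENT_RE = re.compile(r"^\{(.+)>\}$")   # e.g. {inversion>}


def forward_comments(lines):
    """Return (comment, next tagged line) for each comment ending in an arrow."""
    found = []
    for i, line in enumerate(lines):
        m = ARROW_COMMENT_RE.match(line.strip())
        if m is None:
            continue
        following = next(
            (l.strip() for l in lines[i + 1:] if l.strip().startswith("$")),
            None,
        )
        found.append((m.group(1), following))
    return found
```

Since the search relies on the comment and not on the grammel, it also retrieves inversion after modal verbs such as can, whose tags carry no indication of word order. Intervening comments such as `{cond}` are skipped when the following tagged line is located.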
Comments indicating fronting specify which constituent has been fronted. The following two examples are variants of structures which include the extraposition of a clausal subject:
{zero pre}
$/P11N_j
$be{n}/vps11<P+_AM
$very/av_WERY
$glad/aj>vi_GLAID
$to/im+H_TO
$hear/vi<aj_HEIR
{\}
$that/cj_THAT
{fronted S>}
$/P02G_zOUR
$lordship/nG{ho}_LO $/Gn{ho}_0
{:}
$action/n{rc}_ACTION
$be{n}/vps13<cnp+_IS
$like/aj_LEIKE
$to/im+C_TO
$come/vi<S>pr-cj_COME
$to/pr-cj+C<vi_TO
$some/pn-aj_SOME
$good/aj_GOOD
$end/n{rc}-av_END
The construction alternates with it is likely that your lordship's action will come to some good end. In order to create a full inventory of variation in the use of the formal subject, the anticipatory it, the grammel of the infinitive come contains <S, a property attached to all extraposed subjects. This practice is also illustrated by the following example, in which the object of an embedded clause has been fronted:
{fronted O>}
$/T_THE
{\}
$particular/aj-npl_PARTICULAR+S $/aj-pln_+S
$be{n}/vsjpt23<aj-npl+_VAR
$long/aj_LONG+SUM $-some/xs-aj_+SUM
$to/im+C_TO
$write/vi<S_VRYTT
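An inventory of extraposed subjects of the kind described above could be compiled by searching the grammels for <S. The following Python sketch is an assumption about how such a search might be written, not a description of the CSC software; the pattern names and the function name are hypothetical, and the tags are taken from the two preceding examples.

```python
import re

# Tags copied from the two examples above.
TAGS = [
    "$to/im+C_TO",
    "$come/vi<S>pr-cj_COME",
    "$to/pr-cj+C<vi_TO",
    "$be{n}/vsjpt23<aj-npl+_VAR",
    "$write/vi<S_VRYTT",
]

# $lexel/grammel_REALISATION, as described in 3.2.4.
TAG = re.compile(r"^\$(?P<lexel>[^/]*)/(?P<grammel>[^_]*)_(?P<form>.*)$")

# Assumption: '<S' counts only when no letter or digit follows it,
# so that a longer label that merely begins with S is not matched.
EXTRAPOSED = re.compile(r"<S(?![A-Za-z0-9])")


def extraposed_subjects(tags):
    """Return (lexel, form) pairs for tags whose grammel contains <S."""
    found = []
    for tag in tags:
        match = TAG.match(tag)
        if match and EXTRAPOSED.search(match.group("grammel")):
            found.append((match.group("lexel"), match.group("form")))
    return found


result = extraposed_subjects(TAGS)
```

For the sample, the search returns the infinitives come (COME) and write (VRYTT), the two items tagged with <S in the examples.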
Object fronting of the following kind is also attested in the data:
{zero pre}
{fronted O>}
$/P02G_YOUR
$last/aj-n_LAST
$/P11N_j
$have/vps11<P+_HAWE
$receive/vpp{psp}_*RESAU+ED $/vpp{psp}_+ED
$with/pr_W^T
$/A+V>pn-av_AN+
$other/pn-av<A>pr_+OTHER
{\}
$from/pr<pn-av_FROM
$/T_THE
$earl/n{tl}>pr_*EARLL
$of/pr<n_OF
;_SEAFORT
When a pronoun acts as a substitute for the fronted constituent, arrows link the fronted item to the object pronoun, which occupies the default object position of an SVO sentence:
{zero pre}
{fronted O>}
$/T_THE
$purpose/n{rc}>P_PURPOS
$/P02N_*ZE
$know/vps02<P+_KNAU $/vps02<P+_0
{\}
$/P13OI<n_ITT
In the following example, a prepositional phrase functioning as an adjective complement has been fronted:
{zero pre}
{fronted complement>}
$for/pr-cj>aj_FOR
$/P13GM_HIS
$courage/n{rc}_CURAGE
$/P11N_j
$think/vps11<P+_THINK $/vps11<P+_0
{zero that}
$/P11N_j
$may/vm_MAY
$be{n}/vi_BE
$answer/aj<pr-cj_ANSUER+A\BLE~ $-able/xs-aj<pr-cj_+A\BLE~
Similarly, the fronting of an adverbial is indicated with a comment:
{fronted av>}
$/Dis_THIS
$summer/n-av_SOMMER
$true/av_TREW+LY $-ly/xs-av_+LY
$/P13NI_IT
$can/vm_CANE
{\}
$/neg<v_NOT
$be{n}/vi_BE
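Because each fronting comment names the fronted constituent, the comments can be tabulated directly. The sketch below is a hypothetical illustration in Python, assuming that comments are available one per line; the input lists, in order, the comments that occur in the fronting examples above.

```python
import re
from collections import Counter

# Comments as they occur in the fronting examples above.
COMMENTS = [
    "{zero pre}", "{fronted S>}", "{:}",
    "{fronted O>}", r"{\}",
    "{zero pre}", "{fronted O>}", r"{\}",
    "{zero pre}", "{fronted O>}", r"{\}",
    "{zero pre}", "{fronted complement>}", "{zero that}",
    "{fronted av>}", r"{\}",
]

# Assumption: a fronting comment has the shape {fronted X>},
# where X names the fronted constituent.
FRONTED = re.compile(r"^\{fronted (?P<constituent>[^>}]+)>\}$")


def fronting_inventory(lines):
    """Count fronting comments by the constituent they name."""
    counts = Counter()
    for line in lines:
        match = FRONTED.match(line.strip())
        if match:
            counts[match.group("constituent")] += 1
    return counts


inventory = fronting_inventory(COMMENTS)
```

Other comments, such as {zero pre} or {zero that}, are ignored by the pattern, so the resulting counts cover only the fronted constituents (S, O, complement and av in this sample).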
Features of visual prosody are described by comments inserted in the tagged text. Marked or ambiguous character shapes are described, and general comments are positioned before the body of the text. Insertions and deletions are both tagged and marked with a comment:
{zero pre}
$/P11N_j
{ins}
$do/vpt>vi_DID
{ins}
$write/vi<v>pr>pr_WRYTT
$unto/pr+C<vi_VNTO
$/P02G_*ZOUR
$lordship/n{ho}_LO
$before/pr_BEFOR
$christmas/n-av_CRISSIN\MESS
$in/pr<vi_IN
$/A+C_A
$particular/aj-n_PARTICULAR
As this example illustrates, the recording of an item as an insertion can provide linguistically interesting information.
Since extra space between words in the manuscript may be relevant in interpreting sentence and clause structure, the comment {space} is added wherever such space occurs.
3.2.4.6 Morphological analysis
Since the general approach here is variationist, the analysis of both inflectional and derivational morphemes includes any final elements of the base that show a high degree of variation. This practice is purely technical: it is assumed to permit the production of inventories that are much more informative than those which include only an inflectional ending or a suffix. For example, plural nouns are analysed as follows:
$pain/npl_PEAN+ES $/pln_+ES
$hope/npl{rc}_HOP+ES $/pln{rc}_+ES
$knave/npl_KNAW+EIS $/pln_+EIS
$parcel/npl_PARSEL+Lis $/pln_+Lis
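The kind of inventory that this analysis permits can be sketched as follows. The Python code is a hypothetical illustration, not part of the CSC tools; it assumes that each line holds a word tag followed by its morpheme tag, separated by a space, as in the four examples above.

```python
from collections import defaultdict

# The four plural nouns given above.
LINES = [
    "$pain/npl_PEAN+ES $/pln_+ES",
    "$hope/npl{rc}_HOP+ES $/pln{rc}_+ES",
    "$knave/npl_KNAW+EIS $/pln_+EIS",
    "$parcel/npl_PARSEL+Lis $/pln_+Lis",
]


def split_tag(tag):
    """Split '$lexel/grammel_FORM' into (lexel, grammel, form)."""
    lexel, rest = tag[1:].split("/", 1)
    grammel, form = rest.split("_", 1)
    return lexel, grammel, form


def ending_inventory(lines):
    """Map each attested ending to the lexels it occurs with."""
    inventory = defaultdict(list)
    for line in lines:
        word_tag, morpheme_tag = line.split()
        lexel, _, form = split_tag(word_tag)
        _, _, ending = split_tag(morpheme_tag)
        # Assumption: the morpheme tag repeats the part of the
        # word form that begins at '+'.
        if not form.endswith(ending):
            raise ValueError(f"{form!r} does not end in {ending!r}")
        inventory[ending].append(lexel)
    return dict(inventory)


endings = ending_inventory(LINES)
```

In the resulting inventory the ending of PARSEL+Lis is recorded as +Lis rather than as a bare plural marker, so the doubled final consonant of the base remains visible as a variant.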
There are cases in which items written with flourishes may be interpreted as plural forms, and the following morphological analysis permits the identification of these ambiguous cases (^ indicates that the abbreviation is a superscript in the original):
$pound/nqpl_^L^L +^~ $/plnq_ +^~
The base and a derivational suffix are analysed as follows:
$amend/n{rc}_*AMEND+EMENT $-ment/xs-n{rc}_+EMENT
Similarly, the treatment of inflectional morphemes in verbs can be illustrated with the following examples:
$carry/vps13<P+_CARI+ETH $/vps13<P+_+ETH
$move/vpt_MOW+EIT $/vpt_+EIT
$revenge/vn{rc}_REWENG+EING $/vn{rc}_+EING
$have{n}/vn{rc}_HAV+EING $/vn{rc}_+EING
$remit/vpsp_REMIT+TING $/vpsp_+TING
$settle/vpp{pass}_SETL+ED $/vpp{pass}_+ED
$get/venpp{psp}_GOT+TEN $/venpp{psp}_+TEN
$please/vpp{pass}-aj_PLEAS+ET $/vpp{pass}-aj_+ET
The tagging of the comparative and superlative forms of adjectives reflects the same practice:
$wise/aj-cpv_WYS+SER $-er/xs-aj-cpv_+SER
$up/aj-cpv_WP+PER $-er/xs-aj-cpv_+PER
$late/aj-sup_LET+TEST $-est/xs-aj-sup_+TEST
The same analytical principle is applied to both native and borrowed items:
$gracious/aj_GRAT+IOUS $-ous/xs-aj_+IOUS
$love/vpsp-aj_LOW+EING $/vpsp-aj_+EING
Since the adjective is not considered an English coinage, the lexel is gracious rather than grace.
References
Anderson, John M. 1997. A Notional Theory of Syntactic Categories. Cambridge: Cambridge University Press.
Brinton, Laurel J. 2007. 'Rise of the adverbial conjunctions {any, each, every} time'. In: Connectives in the History of English, ed. Ursula Lenker and Anneli Meurman-Solin. (Current Issues in Linguistic Theory, 283). Amsterdam: Benjamins, 77-96.
Hopper, Paul J. and Sandra A. Thompson 2004. 'The Discourse Basis for Lexical Categories in Universal Grammar'. In: Fuzzy Grammar, ed. Bas Aarts, David Denison, Evelien Keizer and Gergana Popova. Oxford: Oxford University Press, 247-291. First published in Language 60 (1984): 703-52.
Houston, R. A. 1985. Scottish Literacy and the Scottish Identity. Illiteracy and society in Scotland and northern England, 1600-1800. Cambridge: Cambridge University Press.
Huddleston, Rodney and Geoffrey K. Pullum 2002. The Cambridge Grammar of the English Language. Cambridge: Cambridge University Press.
Jackendoff, Ray 2002. Foundations of language: brain, meaning, grammar, evolution. Oxford: Oxford University Press.
Jucker, Andreas H. 1991. 'Between Hypotaxis and Parataxis: Clauses of Reason in Ancrene Wisse'. In: Historical English Syntax, ed. Dieter Kastovsky. (Topics in English Linguistics, 2). Berlin: Mouton de Gruyter, 203-220.
Kohnen, Thomas 2007. '"Connective profiles" in the history of English texts: Aspects of orality and literacy'. In: Connectives in the History of English, ed. Ursula Lenker and Anneli Meurman-Solin. (Current Issues in Linguistic Theory, 283). Amsterdam: Benjamins, 290-308.
Laing, Margaret 1993. Catalogue of Sources for a Linguistic Atlas of Early Medieval English. Cambridge: D.S. Brewer.
Laing, Margaret 2002. 'Corpus-provoked Questions about Negation in early Middle English'. Language Sciences 24: 297-321.
Laing, Margaret 2004. 'Multidimensionality: Time, Space and Stratigraphy in Historical Dialectology'. In: Methods and Data in English Historical Dialectology, ed. Marina Dossena and Roger Lass. Bern: Peter Lang, 49-96.
Laing, Margaret and Keith Williamson 2004. 'The Archaeology of Medieval Texts'. In: Categorization in the History of English, ed. Christian J. Kay and Jeremy J. Smith. Amsterdam: Benjamins, 85-145.
Lehmann, Christian 1988. 'Towards a Typology of Clause Linkage'. In: Clause Combining in Grammar and Discourse, ed. John Haiman and Sandra A. Thompson. (Typological Studies in Language, 18). Amsterdam: Benjamins, 181-225.
Lenker, Ursula 2007. 'Forhwi "because": Shifting deictics in the history of English causal connection'. In: Connectives in the History of English, ed. Ursula Lenker and Anneli Meurman-Solin. (Current Issues in Linguistic Theory, 283). Amsterdam: Benjamins, 193-227.
Lenker, Ursula and Anneli Meurman-Solin (eds) 2007. Connectives in the History of English. (Current Issues in Linguistic Theory, 283). Amsterdam: Benjamins.
Marshall, Rosalind K. 1983. Virgins and Viragos. A History of Women in Scotland 1080 to 1980. London: Collins.
Meurman-Solin, Anneli 1992. 'On the morphology of verbs in Middle Scots: present and present perfect indicative'. In: History of Englishes. New Methods and Interpretations in Historical Linguistics, ed. Matti Rissanen, Ossi Ihalainen, Terttu Nevalainen and Irma Taavitsainen. Berlin: Mouton de Gruyter, 611-623. Reprinted in Meurman-Solin (1993).
Meurman-Solin, Anneli 1993. Variation and change in early Scottish prose. Studies based on the Helsinki Corpus of Older Scots. (Annales Academiae Scientiarum Fennicae, Diss. Humanarum Litterarum, 65). Helsinki.
Meurman-Solin, Anneli 1999. 'Letters as a Source of Data for Reconstructing Early Spoken Scots'. In: Writing in Nonstandard English, ed. Irma Taavitsainen, Gunnel Melchers and Päivi Pahta. Amsterdam: Benjamins, 305-322.
Meurman-Solin, Anneli 2000. 'On the conditioning of geographical and social distance in language variation and change in Renaissance Scots'. In: The History of English in a Social Context. A Contribution to Historical Sociolinguistics, ed. Dieter Kastovsky and Arthur Mettinger. Berlin: Mouton de Gruyter, 227-255.
Meurman-Solin, Anneli 2001. 'Women as Informants in the Reconstruction of Geographically and Socioculturally Conditioned Language Variation and Change in the 16th and 17th Century Scots'. Scottish Language 20: 20-46.
Meurman-Solin, Anneli 2002. 'Simple and complex grammars: The Case of Temporal Subordinators in the History of Scots'. In: Variation Past and Present. VARIENG Studies on English for Terttu Nevalainen, ed. Helena Raumolin-Brunberg, Minna Nevala, Arja Nurmi and Matti Rissanen. (Mémoires de la Société Néophilologique de Helsinki, 61). Helsinki: Société Néophilologique, 187-210.
Meurman-Solin, Anneli 2004a. 'From Inventory to Typology in English Historical Dialectology'. In: New Perspectives on English Historical Linguistics, Volume I: Syntax and Morphology, ed. Christian Kay, Simon Horobin and Jeremy Smith. Amsterdam: Benjamins, 125-151.
Meurman-Solin, Anneli 2004b. 'Towards a Variationist Typology of Clausal Connectives. Methodological Considerations Based on the Corpus of Scottish Correspondence'. In: Methods and Data in English Historical Dialectology, ed. Marina Dossena and Roger Lass. (Linguistic Insights. Studies in Language and Communication, 16). Bern: Peter Lang, 171-197.
Meurman-Solin, Anneli 2004c. 'Data and Methods in Scottish Historical Linguistics'. In: The History of English and the Dynamics of Power, ed. Ermanno Barisone, Maria Luisa Maggioni and Paola Tornaghi. Alessandria: Edizioni dell'Orso, 25-42.
Meurman-Solin, Anneli 2005. 'Women's Scots: Gender-Based Variation in Renaissance Letters'. In: Older Scots Literature, ed. Sally Mapstone. Edinburgh: John Donald, 424-440.
Meurman-Solin, Anneli 2007a. 'Relatives as sentence-level connectives'. In: Connectives in the History of English, ed. Ursula Lenker and Anneli Meurman-Solin. (Current Issues in Linguistic Theory, 283). Amsterdam: Benjamins, 255-287.
Meurman-Solin, Anneli 2007b. 'Annotating variational space over time'. In: Annotating variation and change, ed. Anneli Meurman-Solin and Arja Nurmi. eVARIENG Series, Vol. 1.
http://www.helsinki.fi/varieng/journal/index.html
Meurman-Solin, Anneli and Arja Nurmi 2004. 'Circumstantial Adverbials and Stylistic Literacy in the Evolution of Epistolary Discourse'. In: Language Variation in Europe. Papers from ICLaVE 2, ed. Britt-Louise Gunnarsson, Lena Bergström, Gerd Eklund, Staffan Fridell, Lise H. Hansen, Angela Karstadt, Bengt Nordberg, Eva Sundgren and Mats Thelander. Uppsala: Universitetstryckeriet, 302-314.
Meurman-Solin, Anneli and Keith Williamson 2004. 'Tagging for the shape of a system: the case of relative pronouns'. A paper presented at the 25th Conference of the International Computer Archive of Modern and Medieval English, University of Verona.
Nevala, Minna 2004. Address in Early English Correspondence. Its Forms and Socio-Pragmatic Functions. (Mémoires de la Société Néophilologique de Helsinki, 64). Helsinki: Société Néophilologique.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech and Jan Svartvik 1985. A Comprehensive Grammar of the English Language. London: Longman.
Rissanen, Matti 1989. 'The Conjunction for in Early Modern English'. Nowele 14: 3-18.
Rissanen, Matti 1999. 'Syntax'. In: The Cambridge History of the English Language. Vol. III: 1476-1776, ed. Roger Lass. Cambridge: Cambridge University Press, 187-331.
Williamson, Keith 1992/93. 'A Computer-aided Method for Making a Linguistic Atlas of Older Scots'. Scottish Language 11-12: 138-173.
Williamson, Keith 2000. 'Changing Spaces: Linguistic Relationships and the Dialect Continuum'. In: Placing Middle English in Context, ed. Irma Taavitsainen, Terttu Nevalainen, Päivi Pahta and Matti Rissanen. Berlin: Mouton de Gruyter, 141-179.
Williamson, Keith 2001. 'Spatio-Temporal Aspects of Older Scots Texts'. Scottish Language 20: 1-19.
Williamson, Keith 2004. 'On Chronicity and Space(s) in Historical Dialectology'. In: Methods and Data in English Historical Dialectology, ed. Marina Dossena and Roger Lass. Bern: Peter Lang, 97-136.
Williamson, Keith 2005. 'DOST and LAOS: a Caledonian symbiosis?'. In: Perspectives on the Older Scottish Tongue. A Celebration of DOST, ed. Christian J. Kay and Margaret A. Mackay. Edinburgh: Edinburgh University Press, 179-198.