A variational pragmatic approach to reformulation markers in English and Hungarian

It is well known that the variationist paradigm was originally developed for the analysis of the social stratification of phonological features, and the methodology was later extended for the study of morpho-syntactic and lexical features. Variationist studies of discourse-pragmatic features are even more recent. Moreover, as Pichler notes, studies of phonological and morpho-syntactic variation and change have been “relatively homogeneous and congruent in focus and methodology” (2010: 582), while there is remarkable heterogeneity in the study of discourse-pragmatic variation due to the “lack of a coherent set of methodological principles” (ibid.).

The present paper will provide a combination of inductive and deductive, quantitative as well as qualitative analyses of Hungarian reformulation markers (RMs) across a variety of speech situations and genres. The study will map the functional spectrum of one thousand randomly selected tokens of the RMs vagyis (raw frequency=27,722), azaz (RF=27,824) and mármint (RF=2,770) in the 181-million-word Hungarian National Corpus (MNSZ) in three registers (literary, political and private discourse) across five regional varieties of Hungarian (those spoken in Hungary, Slovakia, Subcarpathia, Transylvania and Vojvodina) and compare the results with previous research into the use of the most frequent English translation equivalents I mean, that is, and or (rather) (cf. e.g. del Saz Rubio 2003; Cuenca 2003).

The individual tokens have been tagged for the following features: (1) collocations and co-occurrence patterns (including discourse marker clusters), (2) speech act / function / rhetorical role of the host utterance and the preceding utterance, (3) RMs’ position in the utterance, and (4) focus of DM (narrow [NP/VP], and broad focus).

The variation coefficient (CV%) and Juilland’s D dispersion values are surprising: it is the more frequent and intuitively more informal RMs vagyis and azaz that show a relatively even distribution across genres and are less frequent in informal genres (vagyis: CV% = 21.44%, Juilland's D = 0.79, relative frequency in private discourse [RF in PD] = 68%; azaz: CV% = 23.97%, Juilland's D = 0.76, RF in PD = 101%), while the intuitively more formal RM mármint is more unevenly distributed (CV% = 41.74%; Juilland's D = 0.58) and is more frequent in private discourse (RF in PD = 321%).
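
For readers unfamiliar with the two dispersion measures, the following minimal sketch shows one common way of computing them. It assumes the textbook definitions, CV = σ/μ and Juilland's D = 1 − CV/√(n − 1), over n equally sized corpus parts. The abstract does not state which variant of the formula was used, and the frequencies below are invented for illustration; they are not the MNSZ figures.

```python
from math import sqrt
from statistics import mean, pstdev


def variation_coefficient(freqs):
    """Population standard deviation divided by the mean (CV as a proportion)."""
    return pstdev(freqs) / mean(freqs)


def juilland_d(freqs):
    """Juilland's D = 1 - CV / sqrt(n - 1), assuming n equally sized corpus parts."""
    n = len(freqs)
    return 1 - variation_coefficient(freqs) / sqrt(n - 1)


# Hypothetical normalised frequencies of one marker in three registers.
hypothetical = [120.0, 95.0, 140.0]
print(f"CV% = {variation_coefficient(hypothetical):.2%}")
print(f"D   = {juilland_d(hypothetical):.2f}")
```

Under these definitions D approaches 1 when a marker is spread evenly across the parts and 0 when all of its occurrences fall in a single part.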

The paper will argue that the dispersion values can be explained with reference to the correlations between the functional features and socio-pragmatic parameters that have been annotated, and they reflect medium-specificity, degree of planning and institutional norms, and to a lesser degree regional variation.


Cuenca, Maria-Josep. 2003. Two ways to reformulate: a contrastive analysis of reformulation markers. Journal of Pragmatics 35: 1069-1093.

Del Saz Rubio, Milagros. 2003. An analysis of English Discourse Markers of Reformulation. Universitat de València: Servei de Publicacions de la Universitat de València.

Pichler, Heike. 2010. Methods in discourse variation analysis: Reflections on the way forward. Journal of Sociolinguistics 14/5: 581-608.

Social meanings of discourse markers and disfluent speech

Research on the evaluation of linguistic variants of which one is a standard and the other a non-standard feature, e.g. –ing and –in as in "singing", finds that evaluative differences relate to perceived prestige, solidarity and dynamism (e.g. Campbell-Kibler 2011). This study aims to find out whether these dimensions also surface with conversational variants to which the standard/non-standard dichotomy may not apply, e.g. unfilled pauses and, to a lesser degree, the discourse marker “you know”. Similar findings could be seen as evidence for more general evaluative cognitive mechanisms, rather than a specific sociolinguistic one, such as the sociolinguistic monitor (Labov et al. 2011).

This study explores these questions by conducting perceptual tests with several guises: speech without noticeable pauses and discourse markers, and the same speech with three 300ms unfilled pauses and / or discourse particles “like” (D’Arcy 2007) or “you know” (in a specific function, Holmes 1986) inserted. Stimuli were prepared for two female, mid-20s speakers and three topics, resulting in a total of 36 stimuli. Data were collected in England in 2016. 668 respondents rated three stimuli each on scales such as intelligence, casualness, etc. in a between-subjects design.
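
The stimulus count can be reconstructed as follows. This sketch assumes six guises (the neutral recording, pauses only, each particle alone, and pauses combined with each particle). The guise, speaker and topic labels are illustrative shorthand, since the abstract gives only the total.

```python
from itertools import product

# Assumed guise inventory: the abstract describes the manipulations
# but does not list the guises explicitly.
guises = ["neutral", "pauses", "like", "you know",
          "pauses + like", "pauses + you know"]
speakers = ["speaker A", "speaker B"]        # two female speakers (labels hypothetical)
topics = ["topic 1", "topic 2", "topic 3"]   # three topics (labels hypothetical)

# Every guise is crossed with every speaker and every topic.
stimuli = list(product(guises, speakers, topics))
print(len(stimuli))
```

On this reading, 6 guises × 2 speakers × 3 topics yields the 36 stimuli reported.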

Respondent ratings were subjected to mixed effects linear regressions. Excerpts with unfilled pauses are heard as more prestigious (more educated and posh) but less dynamic (less confident, certain, etc.) than the neutral guise. “You know” is heard as less prestigious and less dynamic than the neutral guise. In guises where unfilled pauses precede “you know” or “like”, new social meanings emerge. This study concludes that those mechanisms responsible for the evaluation of standard and non-standard speech also seem to apply to conversational features. It further provides support for a theory of indexicality that is not meaning-additive but meaning-interactive as new meanings emerge when different features are combined.


Campbell-Kibler, Kathryn. 2011. The sociolinguistic variant as a carrier of social meaning. Language Variation and Change 22, 423-41.

D'Arcy, Alexandra. 2007. Like and language ideology: disentangling fact from fiction. American Speech 82, 386-419.

Holmes, Janet. 1986. Functions of you know in women’s and men’s speech. Language in Society 15, 1-22.

Labov, William, Ash, Sharon, Ravindranath, Maya, Weldon, Tracey, Baranowski, Maciej & Nagy, Naomi. 2011. Properties of the sociolinguistic monitor. Journal of Sociolinguistics 15, 431-63.

The discourse-pragmatic marker ‘you know’ in two native and two non-native varieties of English

Recent studies suggest that the same discourse-pragmatic marker [DPM] may be used differently across varieties of English (Kallen 2005, Siemund et al. 2009), or differently by native (NS) and non-native speakers (NNS) (Diskin 2017, Müller 2005). A DPM may also vary as regards its position within the turn (Heritage 2015). ‘You know’ fulfils a variety of functions, including requesting acknowledgement (Schourup 1985) or reassurance (Holmes 1986); appealing to shared knowledge and achieving intimacy (Östman 1981, Schiffrin 1987); and introducing consequence, background information or clarification (Erman 1987). It has been found that ‘you know’ is more frequent among NS than among NNS (Fung and Carter 2007), that its use increases with proficiency and acculturation (Hellermann and Vergun 2007), and that NNS favour its ‘coherence’ functions over its intersubjective functions (House 2009).

This paper presents a quantitative analysis of the frequency, function and position of ‘you know’ in two NS (Irish and Australian English) and two NNS varieties of English (Polish and Chinese migrants in Ireland). The data originates from two corpora of sociolinguistic interviews with 53 individuals collected by the author in Dublin and Melbourne. Using fixed effects regression models, results show that ‘you know’ is employed significantly more frequently among the Polish group as compared to both the Chinese and the NS, with no effect found for proficiency. The Poles were more likely to use ‘you know’ in turn-medial position, to focus or illustrate, whereas the Irish group used ‘you know’ significantly more in an interpersonal turn-final position, often eliciting a (minimal) response from their interlocutor. No differences were found in rates of use between the Irish and the Australians, but, as alternatives to ‘you know’, the Australians were found to employ other turn-initial markers such as ‘I mean’, or ‘look’, whereas the Irish had greater uses of ‘well’ in similar instances.


Diskin, C. 2017. The use of the discourse-pragmatic marker ‘like’ by native and non-native speakers of English in Ireland. Journal of Pragmatics, 120, 144–157.

Erman, B. 1987. Pragmatic Expressions in English, Stockholm, Almqvist & Wiksell.

Fung, L. and Carter, R. 2007. Discourse Markers and Spoken English: Native and Learner Use in Pedagogic Settings. Applied Linguistics, 28(3), 410–439.

Hellermann, J. & Vergun, A. 2007. Language Which Is Not Taught: The Discourse Marker Use of Beginning Adult Learners of English. Journal of Pragmatics, 39(1), 157–179.

Heritage, J. 2015. Well-prefaced turns in English conversation: A conversation analytic perspective. Journal of Pragmatics 88, 88–104.

House, J. 2009. Subjectivity in English as Lingua Franca Discourse: The Case of You Know. Intercultural Pragmatics, 6(2), 171–193.

Kallen, J. 2005. Silence and Mitigation in Irish English Discourse. In: Barron, A. & Schneider, K. P. (eds.) The Pragmatics of Irish English. Berlin: Mouton de Gruyter, 47–71.

Müller, S. 2005. Discourse Markers in Native and Non-Native English Discourse, Amsterdam, John Benjamins.

Östman, J.-O. 1981. You Know: A Discourse Functional Approach, Amsterdam, John Benjamins.

Schiffrin, D. 1987. Discourse Markers, Cambridge, Cambridge University Press.

Siemund, P., Maier, G. and Schweinberger, M. 2009. Towards a More Fine-Grained Analysis of the Areal Distributions of Non-Standard Features of English. In: Penttilä, E. & Paulasto, H. (eds.) Language Contacts Meet English Dialects: Studies in Honour of Markku Filppula. Newcastle upon Tyne: Cambridge Scholars Publishing, 19–46.

Schourup, L. 1985. Common Discourse Particles in English Conversation, New York, Garland.

Um, about that, uh, variable: uh and um in teen instant messaging

Recent variationist studies of filled pauses in English have shown that uh is declining in favour of um in apparent time (Fruehwald, 2016; Wieling et al., 2016). Wieling et al. (2016) suggest that um may be taking on a new discourse function in English, leading to an increase in its frequency. This study adds to these findings with data from a corpus of young-adult Toronto instant messaging (IM) (Tagliamonte, 2003–2006, 2007–2010, 2016; Tagliamonte & Denis, 2008). IM is a written format, meaning that using um/uh requires conscious effort (Tottie, 2017; Wieling et al., 2016). This means that in IM, um/uh are being used as discourse markers, not unconscious ‘hesitation markers’.

While each individual primarily uses one of the two variants, possibly as part of establishing a consistent personal style (as with (ing) in Dinkin, 2014 and u vs. you in Tagliamonte & Denis, 2008), most speakers use both variants, and intraspeaker choice is conditioned by position in message: initial position favours um. Um is primarily used to introduce propositions, indicating confusion, uncertainty, disagreement, and/or discomfort (1-a), while uh is primarily used mid-message or message-finally, indicating overt lexical access (1-b) or hesitation (1-c).

(1)  a.  um well i sorta already told allie i would do something with her cuz she is coming home that day
     b.  ok, i am trying to play that game. . . uh Hearts. . . right
     c.  thats. . . kinda. . . uh. . .
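
Coding for position in message can be sketched as below. This is an illustrative reconstruction rather than the coding protocol used for the study: the function name, the simple tokenisation, and the three-way initial/medial/final scheme are all assumptions.

```python
import re

FILLERS = {"um", "uh"}


def code_positions(message):
    """Return (variant, position) pairs for each um/uh in a message."""
    # Keep alphabetic tokens only, so trailing dots and commas do not count as words.
    tokens = re.findall(r"[a-z']+", message.lower())
    coded = []
    for i, tok in enumerate(tokens):
        if tok in FILLERS:
            if i == 0:
                position = "initial"
            elif i == len(tokens) - 1:
                position = "final"
            else:
                position = "medial"
            coded.append((tok, position))
    return coded


for msg in ["um well i sorta already told allie",
            "ok, i am trying to play that game. . . uh Hearts. . . right",
            "thats. . . kinda. . . uh. . ."]:
    print(code_positions(msg))
```

A message consisting of a single filler is counted as initial here; a real coding scheme would need an explicit decision for that case.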

Contrary to recent work indicating that female speakers favour um (Fruehwald, 2016; Wieling et al., 2016), there is no direct gender effect in this data. However, a significant interaction between gender and syntactic position—women favour um message-initially more than men—may indicate a gender difference in terms of discourse function. Taken together, these results suggest that um and uh have divergent discourse functions online, potentially reflecting emergent differences in the spoken language.


Dinkin, A. J. (2014). A phonological variable in a textual medium: (ing) in online chat. Presented at CVC 8, Kingston, ON.

Fruehwald, J. (2016). Filled pause choice as a sociolinguistic variable. University of Pennsylvania Working Papers in Linguistics, 22(2), 6.

Tagliamonte, S. A. (2003–2006). Linguistic Changes in Canada entering the 21st century. Research Grant. Social Sciences and Humanities Research Council of Canada (SSHRC). #410-2003-0005.

Tagliamonte, S. A. (2007–2010). Directions of change in Canadian English. Research Grant. Social Sciences and Humanities Research Council of Canada (SSHRC). #410 070 048.

Tagliamonte, S. A. (2016). So sick or so cool? The language of youth on the internet. Language in Society, 45(1), 1–32.

Tagliamonte, S. A. & Denis, D. (2008). Linguistic ruin? LOL! Instant messaging and teen language. American Speech, 83(1), 3–34.

Tottie, G. (2017). From pause to word: uh, um and er in written American English. English Language & Linguistics, 1–26.

Wieling, M., Grieve, J., Bouma, G., Fruehwald, J., Coleman, J., & Liberman, M. (2016). Variation and change in the use of hesitation markers in Germanic languages. Language Dynamics and Change, 6(2), 199–234.

Before the rise of um

One of the most dramatic discourse-pragmatic changes in twentieth-century English has progressed under the radar of laypeople and (until recently) linguists: the rise of um as the predominant variant of the ‘filled pause’ variable (UHM) at the expense of uh (Tottie 2011, Fruehwald 2016, Wieling et al. 2016). Fruehwald (2016:43) documents this “textbook” change over 100+ years of apparent time: um increases incrementally between generations and the rise is led by women. In this paper, we investigate UHM, as in (1), at an early stage of change to determine what triggered the rise of um.

(1)     Uh as a rule they harrowed it before they um drilled it.                                  (M/1899)

We utilize the variationist method to examine UHM in the Farm Work and Farm Life Since 1890 corpus of oral histories (recorded in 1984 with elderly farmers in two regions of Ontario, Canada) (Denis 2016). The 24 interviewees were born between 1890 and 1920. For apparent-time contrast, we also consider the two interviewers (community-insiders and university students). Nearly 5000 tokens were extracted and coded for speaker birth year, gender, region and utterance position. Following Tottie (2017), we assume that utterance position may correlate with a discourse-functional contrast. We consider the possibility that functional expansion may have triggered the change (cf. Wieling et al. 2016:228).

The overall frequency of um among the farmers is 11%. We find no significant effect of gender (12% for women, 10% for men). In one region, there is an effect of birth year. In contrast, the interviewers’ frequencies are much higher and gender-differentiated. Lastly, we find no effect of utterance position. Our results indicate that this data covers the first stage of this change. At this early stage, we find no functional difference between the forms, suggesting that functional expansion did not trigger the rise of um.


Denis, D. (2016). Oral histories as a window to sociolinguistic history and language history: Exploring earlier Ontario English with the Farm Work and Farm Life Since 1890 oral history collection. American Speech 91(4): 513–516.

Fruehwald, J. (2016). Filled pause choice as a sociolinguistic variable. University of Pennsylvania Working Papers in Linguistics, 22(2), 6.

Tottie, G. (2017). From pause to word: uh, um and er in written American English. English Language & Linguistics, 1–26.

Wieling, M., Grieve, J., Bouma, G., Fruehwald, J., Coleman, J., & Liberman, M. (2016). Variation and change in the use of hesitation markers in Germanic languages. Language Dynamics and Change, 6(2), 199–234.

Variation and change among pragmatic markers as planners in American English

Uh and um, henceforth UHM, are the archetypal planning devices that speakers use when producing online speech, but the best-known and most frequently studied pragmatic markers well, you know, I mean and like (WYIL) have also been shown to be used as planners as their original meaning has been bleached. These pragmatic markers have either been studied individually over time, or presented together as a group at one particular point in time (e.g. Beeching 2016,  Tagliamonte 2016), but to my knowledge, there have been no attempts at showing their development as a “functional field” over time. The aim of this paper is to present a diachronic overview of UHM and WYIL and to consider them as a possible functional field.

The background is that UHM has been shown to be used most frequently by older speakers (e.g. Tottie 2001, Laserna et al. 2014, Rousier-Vercruyssen 2017); the cause has usually been assumed to be the decline of cognitive functions in older speakers. I wish to discuss the possibility that there are also other factors at work and that speakers in different age groups prefer different markers as planners. The data presented in Table 1, based on c. 170,000 words from the Santa Barbara Corpus of Spoken American English (SBC), suggest that this may be the case. Note that there is a marked decline in the use of UHM over time: the oldest speakers use much more of these than younger speakers, whereas you know is more frequent in young and medium age groups. As expected, like is only frequent in the youngest group. The data obviously have to be used with great circumspection, as WYIL retain much of their original meanings, but I shall show that their planning function is often clear from context.

Table 1. The distribution of uh, um, you know, I mean and like in impromptu conversation in American English in the SBC subsample.

Age     uh    um    you know   I mean   like   N
15-24   16%   17%   27%        11%      29%    1383
25-34   43%   15%   28%        8%       7%     1454
35-44   44%   22%   17%        8%       9%     603
45-59   42%   23%   25%        9%       1%     1049
60+     58%   23%   16%        3%       <1%    909
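
To make the age trend in Table 1 easier to see, the sketch below sums the uh and um percentages for each age group. The column assignment follows the order of the markers in the caption, which is an assumption; the figures themselves are taken from the table.

```python
# Percentages for uh and um per age group, as read from Table 1
# (column order assumed to follow the caption).
table1 = {
    "15-24": (16, 17),
    "25-34": (43, 15),
    "35-44": (44, 22),
    "45-59": (42, 23),
    "60+":   (58, 23),
}

# Combined UHM share = uh share + um share within each age group.
uhm_share = {age: uh + um for age, (uh, um) in table1.items()}
for age, share in uhm_share.items():
    print(f"{age}: UHM = {share}% of the five markers")
```

On this reading UHM accounts for about a third of the tokens in the youngest group and about four fifths in the oldest.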


Discursive like across apparent time in Australian English

Although discursive uses of like have been examined in many varieties, they have received little attention in Australian English (AusE). Thus far only Miller (2009) has studied like in these contexts within AusE, finding that in spoken corpora (ICE‐AUS and the Australian Radio Talkback corpus), like is the fourth most frequent discourse marker. Miller’s examination of 122 instances showed equal uses of clause initial and medial like (marker and particle, respectively, in D’Arcy’s [2017] terminology), which contrasts with the higher medial use found in other varieties of English (Siemund, Maier, & Schweinberger, 2009).

To investigate the current variation and change in the use of discursive like in AusE, this paper provides a sociolinguistic view on its use through interview data. The study is based on close to 3080 tokens of like from 87 speakers across four participant age‐groups: adolescent, young adult, middle‐aged and older. The position of like (initial/medial/final) is explored and quantified for each age group and by speaker sex. In addition to showing patterning in relation to these variables, the analysis in apparent time allows assertions regarding ongoing language change in AusE.

The findings are in line with previous studies in that the use of initial and medial like is increasing in apparent time. All four age‐groups show higher initial use, though with age‐based variation. In terms of speaker sex, however, the analysis does not support the common assumption of higher use amongst young females.

By including the examination of sociolinguistic factors, this study expands the currently very limited knowledge of the use of like as a discourse marker in AusE and contributes to the small number of quantitative papers on discourse‐pragmatic features in this variety of English. It further allows for comparisons with interview data in other varieties of English.


D'Arcy, A. (2017). Discourse‐pragmatic variation in context: Eight hundred years of LIKE. Amsterdam; Philadelphia: John Benjamins.

Miller, J. (2009). Like and other discourse markers. In P. C. Collins, P. Peters & A. Smith (Eds.), Comparative studies in Australian and New Zealand English (pp. 317–337). Amsterdam; Philadelphia: John Benjamins.

Siemund, P., Maier, G., & Schweinberger, M. (2009). Towards a more fine‐grained analysis of the areal distributions of non‐standard features of English. In E. Penttilä & H. Paulasto (Eds.), Language contacts meet English dialects: Studies in honour of Markku Filppula (pp. 19–46). Newcastle upon Tyne: Cambridge Scholars Publishing.

Toi and tota – from a pronoun to a particle

This paper examines the planning expressions toi (tua, tuo) and tota (tuata, tuota), ‘well, ehm’, originating from a demonstrative pronoun corresponding to ‘that’, in spoken Finnish. It is not an uncommon phenomenon that a pronoun is used as a “filler” in spontaneous speech when a speaker encounters trouble formulating an utterance. Demonstrative pronouns have this function in many languages (Hayashi & Yoon 2006). The aim of this paper is to clarify the process of a pronoun turning into a particle by examining the case of toi in Finnish. The paper studies the use of toi and tota in everyday conversation, in the morphosyntactically coded Arkisyn-database.

Hayashi and Yoon (2006) describe three distinct usage types of demonstratives in the context of word-formulation trouble. Two of the types, the placeholder use and the interjective hesitator use, are relevant where toi is concerned. Placeholders are referential pronouns and form part of the syntactic structure of the utterance; that is, the forms used correspond syntactically and semantically to the word whose place the pronoun is holding. In contrast, hesitators are non-referential, have no role as a clausal constituent, and have little correspondence to the word the speaker is searching for.

This paper shows that, in Finnish, there is a continuum from a genuine demonstrative pronoun to a particle. That is, the reference turns more and more “open” and unclear until it is lost completely. (For “openness” of tota, see Etelämäki & Jaakola 2009.) At the referential end of the continuum there are placeholder pronouns, often in the nominative case (toi) and used in word searches and for other purposes. At the non-referential end there are genuine particles, usually in the form of the partitive case (tota). In between there are occurrences in either partitive or nominative case, with some, but often vague, reference.


Hayashi, M & Yoon, K. 2006: A cross-linguistic exploration of demonstratives in interaction. With particular reference to the context of word-formulation trouble. Studies in Language 30:3, 485–540.

Etelämäki, M. & Jaakola, M 2009: Tota ja puhetilanteen todellisuus. [’Tota and the reality of speech situation.’] Virittäjä 113, 188–212.

The Thai Pragmatic Particle di: Corpus Analysis and the Use of di Compared to si by Native Thai Speakers

Final particles are an areal feature of languages spoken in Southeast Asia and some countries in East Asia (Goddard 2005). Thai pragmatic particles, or final particles, are generally used in spoken language to express speakers' feelings or attitudes, intimacy between interlocutors, politeness, and social status. Moreover, they are important in making conversations sound smooth and natural. One of the well-known particles in Thai is si, which is normally found in commands or confirmative utterances. According to Maklai (2015), the communicative functions of si are to increase the authority of an utterance, to make an utterance firm, to show the speaker's lack of interest, and to mark the topic of an utterance with a contradictory tone. Recently, the use of the pragmatic particle di instead of si has increased among many native Thai speakers, especially the young. This study aims to investigate whether the pragmatic particle di is a variant of si, and to explore the use of di compared to si by native Thai speakers. The study consists of two parts. The first part analyses the communicative functions of di using data from the Thai National Corpus (TNC) and compares them with the communicative functions of si described in Maklai (2015). The results show that, in general, the pragmatic particle di has communicative functions similar to those of si, and it can be concluded that di is one of the variants of si. The second part examines the use of di in an online survey of two age groups of native Thai speakers: 40 participants aged 20-30 and 40 participants aged 50-60. The findings show that the 20-30-year-old participants generally use di significantly more than si, whereas the 50-60-year-old participants rarely use di. The age of the speaker thus appears to be one important factor, among many, in the use of this variant of si.


Goddard, Cliff. 2005. The Languages of East and Southeast Asia: An Introduction. Oxford: Oxford University Press.

Maklai, Sumintra. 2015. The Acquisition of the Thai Final Particles na and si by Learners of Thai as a Second Language (การรับอนุภาคลงท้าย "นะ" และ "สิ" ของผู้เรียนภาษาไทยเป็นภาษาที่สอง). Doctoral Dissertation, Chulalongkorn University, Bangkok. (In Thai)

Expressive pseudo-masculine particles in the history of American English: A corpus-based account

Expressive particles are a type of pragmatic particle used as interjections to convey emotional stance or express an attitude toward the interlocutor while adding no substantial new information to the propositional content (see Aijmer 1996, McCready 2009). One particular subtype of expressive particle is the pseudo-masculine particle: nouns such as man or buddy that express a sense of companionship or common ground, but may also indicate condescension or a sense of excitement or exasperation. Although these particles have been discussed in the literature before (Hill 1994, Kiesling 2004, Siegel 2005, McCready 2006 and 2009, Rendle-Short 2010, Nousiainen 2014), no previous study has traced their use over a long timeline applying detailed grammatical and pragmatic analysis to evidence from large corpora.

In this paper, I will discuss the diachronic development and distribution of the particles boy, buddy, dude, man and mate in American English. Using the 400-million-word Corpus of Historical American English (COHA) as primary data and operationalising the particles as single tokens separated by punctuation, the c. 12,500 observations were analysed using the following variables: sentence type, placement within the sentence, tense, polarity, appositional function, direct interlocutor address or absence thereof, the presence of modal verbs and their type, the semantic class of the lexical verb, and the pragmatic function of the particle. This matrix is analysed using multifactorial non-parametric regression, in the present study recursive partitioning (see Strobl et al. 2009), to identify significant and substantial trends over time. As will be shown, the use of the particles remained essentially stable at a low frequency until the 1940s, after which both their use and lexical diversity increased rapidly, peaking in the 1970s and declining thereafter.
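
One way to read the operationalisation "single tokens separated by punctuation" is sketched below. The regular expression and function name are hypothetical and are not the query used for COHA. The sketch deliberately ignores cases such as "Oh boy," where the particle is not set off by punctuation on both sides.

```python
import re

PARTICLES = ("boy", "buddy", "dude", "man", "mate")

# A particle counts only if it starts the string or follows punctuation,
# and is itself followed by punctuation (checked with a lookahead so that
# a shared comma can also delimit the next particle).
PATTERN = re.compile(
    r"(?:^|[.,;:!?]\s*)(" + "|".join(PARTICLES) + r")(?=\s*[.,;:!?])",
    re.IGNORECASE,
)


def find_particles(text):
    """Return the punctuation-delimited particles in a text, lower-cased."""
    return [m.group(1).lower() for m in PATTERN.finditer(text)]


print(find_particles("Man, that was close. Come on, dude!"))
print(find_particles("I saw a man walking his dog."))
```

The referential noun in the second sentence is not returned, because it is not set off by punctuation.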


Aijmer, Karin. 1996. English Discourse Particles. Evidence from a corpus. Amsterdam: John Benjamins.

Davies, Mark. 2010-. The Corpus of Historical American English (COHA): 400 million words, 1810-2009. Available online at https://corpus.byu.edu/coha/.

Hill, Richard. 1994. You’ve Come a Long Way, Dude—A History. American Speech 69. 321–27.

Kiesling, Scott F. 2004. Dude. American Speech, Vol. 79, No. 3, Fall 2004. 281–305.

McCready, Eric. 2006. English sentence-initial man. In C. Ebert & C. Endriss (eds.), Proceedings of Sinn und Bedeutung 10, Vol. 44 of ZASPIL–ZAS Papers in Linguistics. 211–223.

McCready, Eric. 2009. What man does. Linguistics and Philosophy, 31. 671–724.

Nousiainen, Maija. 2014. ‘Why does it always have to be dudes, dude?’: A corpus-based study on dude as an address term in web-based World Englishes. Master’s thesis. University of Tampere.

Rendle-Short, Johanna. 2010. ‘Mate’ as a term of address in ordinary interaction. Journal of Pragmatics 42: 1201-1218.

Siegel, Muffy. 2005. Dude, Katie! Your dress is so cute: Why dude became an exclamation. Verbatim, The Language Quarterly 30, 4: 15-18.

Strobl, Carolin, James Malley & Gerhard Tutz. 2009. An introduction to recursive partitioning: Rationale, application and characteristics of classification and regression trees, bagging and random forests. Psychological Methods 14(4): 323–348.

The role of (historical) pragmatics in the uses of response particles. The case of French

Many languages use particles as minimal affirmative vs negative responses to a preceding utterance by a different speaker. Typologically, response particles function according to two basic systems, a polarity-based one and a (dis)agreement-based one.

The French system is often thought of as polarity-based, oui (‘yes’) and non (‘no’) marking the positive vs negative polarity of the response. However, it is in fact a hybrid system, integrating elements of (dis)agreement. Saliently, French has a second affirmative particle si, which marks reversal of the negative polarity of, and thus disagreement with, the utterance it responds to, cf. (1):

(1)   A : Jean ne viendra pas.
       B : Si(, il viendra)./Non(, il ne viendra pas).

       ‘A: Jean won’t come.
        B: Yes(, he will)./No(, he won’t).’

Moreover, oui is often preferred to si or non when responding with agreement to syntactically negative utterances that are positively oriented at the pragmatic level, as seen in (2):

(2)   A: N’êtes-vous pas la fille de X ?
       B : Oui/Si.

       ‘A: Aren’t you X’s daughter?
       B: Yes.’
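
The hybrid system described above can be summarised as a small decision procedure. The sketch is a simplification for illustration only: the function and parameter names are invented, and for positively oriented negative questions it simply lists both options attested in example (2).

```python
def french_response(antecedent_negative, response_positive, positively_oriented=False):
    """Return the candidate French response particle(s) for a given context."""
    if not response_positive:
        # Negative responses use non regardless of the antecedent's polarity.
        return ["non"]
    if not antecedent_negative:
        # Positive response to a positive antecedent: plain agreement.
        return ["oui"]
    if positively_oriented:
        # Syntactically negative but pragmatically positive, as in (2):
        # oui is often preferred, si remains possible.
        return ["oui", "si"]
    # Positive response reversing a negative antecedent, as in (1).
    return ["si"]


print(french_response(antecedent_negative=True, response_positive=True))
print(french_response(antecedent_negative=True, response_positive=True,
                      positively_oriented=True))
print(french_response(antecedent_negative=True, response_positive=False))
```

A purely polarity-based system would need only the second argument; the first and third capture the (dis)agreement component.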

I argue that a better understanding of the current system can be obtained by taking historical pragmatics into account. The French response particles result from lexicalization of two different constructions in Medieval French, oui < oïl < o il < Latin HOC ILLE (FECIT) (‘this he/it (did)’) vs si (< Latin SIC ‘thus’)/non + V. Medieval French had a second negative marker, viz. nenni, whose source construction nenil < nen il (‘not he/it’) is analogous to that of oui. I show quantitatively that the two pairs of response markers (oïl/nenil vs non/si) originally occurred in distinct types of contexts and had different types of pragmatic import. This remains true of oui/si, whereas in the case of the negative markers, non gradually encroached upon the territory of nenni, eventually ousting the latter.

Selected references

Høybye, Poul. 1939. Oui, si et non. Le français moderne 7: 47-51.

Kerbrat-Orecchioni, Catherine. 2001. Oui, non, si : un trio célèbre et méconnu. Marges linguistiques 2, pp. 95-119 - http://www.marges-linguistiques.com

König, Ekkehard & Peter Siemund. 2007. Speech act distinctions in grammar. In Timothy Shopen, ed. Language Typology and Syntactic Description, vol. I: Clause Structure. 2nd ed. Cambridge: CUP.

Marchello-Nizia, Christiane. 1985. Dire le vrai. L’adverbe si en français médiéval. Geneva: Droz.

Plantin, Christian. 1982. Oui et non sont-ils des « pro-phrases » ? Le français moderne 50, 252-265.

Pohl, Jacques. 1976. Matériaux pour l’histoire du système oui-non-si. Kwartalnik Neofilologiczny 23, 197-208.

Wilmet, Marc. 1976. Oui, si et non en français moderne. Le français moderne 44(3): 229-251.

On the diachrony of giusto? (right?) in Italian: A new discoursivization?

In Italian, the adjective giusto ‘right’ has performed the discourse function of response marker since at least 1613 (DELI, 2008: 671). This study shows that in the last forty years, the adjective has undergone a new process of discoursivization, defined as the diachronic process that ends in discourse (Ocampo, 2006: 317). In particular, it investigates giusto as serving the function of invariant tag (Andersen 2001), a linguistic item appended to a statement for the purpose of seeking information, verification or corroboration of a claim (Millar & Brown 1979). Lexicographic, quantitative and qualitative analyses are carried out over a range of Italian historical (1729 - 1950) and contemporary dictionaries and over written and spoken corpora (1200 - 2017) to retrieve records of first appearance, frequency of occurrence, diachronic trends, and contexts of use of giusto?. The results reveal that, although the use of giusto? as an invariant tag has not yet been documented by contemporary Italian lexicography, records of such a use are in fact found from 1980 onwards. Moreover, its high frequency of occurrence in the corpora suggests that giusto? represents a case of discoursivization. Finally, by analysing the distribution of the studied constructions in a corpus of Italian dubbed from (American) English, the study also explores the possibility that language contact with English, mainly via dubbing translations, may have played a concurrent fundamental role in this change.


Cortelazzo, M. & Zolli, P. (eds.). 2008. Il nuovo etimologico: Dizionario etimologico della lingua italiana. Con CD-ROM. (DELI). Bologna: Zanichelli.

Ocampo, F. 2006. “Movement Towards Discourse is Not Grammaticalization: The Evolution of claro from Adjective to Discourse Particle in Spoken Spanish”. In Nuria Sagarra and Almeida Jacqueline Toribio (eds), Selected Proceedings of the 9th Hispanic Linguistics Symposium, 308–19. Somerville, MA: Cascadilla Proceedings Project.

Millar, M. and Brown, K. 1979. "Tag questions in Edinburgh speech". Linguistische Berichte 60, 24-45.

Variability of German Question Tags

In this work we analyze German question tags across different media channels. The large inventory of German question tags leads to great dialectal and pragmatic variability (ex. (1)). While tag questions (TQs) are characteristic of conversational speech, they also appear in scripted conversations (the OpenSubtitles[1] corpus; Lison and Tiedemann, 2016) and in written conversations (Twitter; Scheffler, 2014). We address two interrelated questions: (i) Which aspect of conceptual orality (Koch and Oesterreicher, 1985) in media channels facilitates the use of TQs? (ii) What do the technological and social settings of the channels say about the pragmatics of the individual question tags?

  (1) Du   musst  nicht  in  die  Schule,  {ne, oder, wa, ja, …}?
      you  must   not    in  the  school   right
      ‘You don't have to go to school, do you?’

We examine the occurrence of TQs in German Twitter, telephone (Karins et al., 1997) and scripted conversations through quantitative methods. We find that TQs are most frequent in telephone speech, although they also feature prominently in the other corpora (Figure 1). This indicates that TQs are an important method for establishing and maintaining common ground in conversations, whether spoken, scripted, or written. We further analyze the pragmatic context of a sample of question tags in the three corpora, including e.g. their co-occurrence with modal particles and the speaker's certainty (annotated based on examples in context). The usage pattern reveals significant differences regarding specific question tags across corpora. Overall, TQs frequently occur in all studied corpora, which points to the fact that they are licensed by interactive conversations rather than the spoken mode. 
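Comparing tag frequencies across corpora of very different sizes requires normalisation, as in the per-million figures reported in Figure 1. A minimal sketch of that step follows; the function name, raw counts and corpus sizes are hypothetical and serve only as illustration, not as the study's actual figures.

```python
def per_million(count: int, corpus_tokens: int) -> float:
    """Return the number of occurrences per one million tokens."""
    if corpus_tokens <= 0:
        raise ValueError("corpus_tokens must be positive")
    return count * 1_000_000 / corpus_tokens


# Hypothetical (raw tag count, corpus size in tokens) pairs, for illustration only.
corpora = {
    "twitter": (1_200, 4_000_000),
    "telephone": (900, 300_000),
    "subtitles": (2_500, 10_000_000),
}

# Normalised rates make the three channels directly comparable.
rates = {name: per_million(count, size) for name, (count, size) in corpora.items()}

for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.1f} tags per million tokens")
```

With normalised rates, a small corpus with few raw hits can still turn out to be the channel with the densest tag use.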


Karins, Krisjanis, et al. 1997. CALLHOME German Transcripts LDC97T15. Web Download. Philadelphia: Linguistic Data Consortium.

Koch, Peter, and Wulf Oesterreicher. 1985. Sprache der Nähe-Sprache der Distanz. Romanistisches Jahrbuch 36:15-43.

Lison, Pierre, and Jörg Tiedemann. 2016. Opensubtitles2016: Extracting large parallel corpora from movie and TV subtitles. In Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC).

Scheffler, Tatjana. 2014. A German Twitter snapshot. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC), 2284-2289. Reykjavik, Iceland.

[1]  http://www.opensubtitles.org/

Figure 1: Number of occurrences of German tags in the different types of media per 1 million tokens.

Inter-speaker accommodation on backchannels in narratives

Backchannels like “yeah”, “mhm”, “aye”, or “right” have been shown to vary based on language (Clancy, Thompson, Suzuki, & Tao, 1996; White, 1989), variety (Cathcart, Carletta, & Klein, 2003; O’Keeffe & Adolphs, 2008), and speaker gender (Bilous & Krauss, 1988; Fellegy, 1995; Kogure, 2003; Reid, 1995). Speakers have been shown to accommodate (Giles, Coupland, & Coupland, 1991) to their interlocutors’ backchannel frequency both across (Ike & Moulder, 2017; White, 1989) and within languages (Schweitzer & Lewandowski, 2012).

Interactional and pragmatic studies demonstrate that backchannels vary based on function and sequential placement (Bavelas, Coates, & Johnson, 2000; Gardner, 1998; Goodwin, 1986; Guthrie, 1997; Norrick, 2012; Schegloff, 1982) as well as position in a story (Guardiola, Bertrand, Espesser, & Rauzy, 2012). However, these linguistic factors are not generally included in analyses of backchannel accommodation.

The present study addresses this gap by analysing inter-speaker accommodation in backchannel production in a set of six dyadic conversations between four Scottish participants (each participant talks to each of the other three in turn). I focus on one interactional context, stories (Labov & Waletzky, 1966), and examine backchannel type and (normalised) frequency with respect to sequential placement, interactional function, and position in the narrative.

Initial results show, first, that individual speakers vary in how often they produce backchannels in response to stories and, second, point to an interlocutor effect, with speakers deviating from their mean backchannel frequency and moving towards their interlocutor’s backchannel frequency.
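One way to make the interlocutor effect concrete is sketched below. It is a hypothetical operationalisation, not the procedure used in the study: the normalisation base (backchannels per 100 words of story), the function names and the sample values are all assumptions.

```python
def rate_per_100_words(backchannels: int, story_words: int) -> float:
    """Normalised backchannel frequency: backchannels per 100 words of story."""
    if story_words <= 0:
        raise ValueError("story_words must be positive")
    return 100 * backchannels / story_words


def moves_towards(own_mean: float, partner_mean: float, observed: float) -> bool:
    """True if the observed rate is closer to the partner's mean than the
    speaker's own mean is, i.e. the speaker has shifted towards the partner."""
    return abs(observed - partner_mean) < abs(own_mean - partner_mean)


# Hypothetical values: a listener who averages 4 backchannels per 100 words
# talks to a partner who averages 8, and produces 18 backchannels in a
# 300-word story.
observed = rate_per_100_words(18, 300)
print(observed, moves_towards(own_mean=4.0, partner_mean=8.0, observed=observed))
```

A distance-based check of this kind only compares aggregate rates; it says nothing about backchannel type, function or placement, which the study treats separately.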

The difference between speakers’ aggregate backchannel behaviour might be indicative of backchannel use being socially stratified along axes of age, gender, and social class, which can be explored further in the full dataset. This preliminary analysis will then be combined with a qualitative analysis of the narratives and the responses to them, for example by comparing the same story being narrated to and received by two different interlocutors.


Bavelas, J. B., Coates, L., & Johnson, T. (2000). Listeners as co-narrators. Journal of Personality and Social Psychology, 79(6), 941–952. https://doi.org/10.1037/0022-3514.79.6.941

Bilous, F. R., & Krauss, R. M. (1988). Dominance and accommodation in the conversational behaviours of same- and mixed-gender dyads. Language & Communication, 8(3–4), 183–194. https://doi.org/10.1016/0271-5309(88)90016-X

Cathcart, N., Carletta, J., & Klein, E. (2003). A shallow model of backchannel continuers in spoken dialogue. Proceedings of the Tenth Conference on European Chapter of the Association for Computational Linguistics (EACL ’03), 1, 51–58. https://doi.org/10.3115/1067807.1067816

Clancy, P. M., Thompson, S. A., Suzuki, R., & Tao, H. (1996). The conversational use of reactive tokens in English, Japanese, and Mandarin. Journal of Pragmatics, 26(3), 355–387. https://doi.org/10.1016/0378-2166(95)00036-4

Fellegy, A. M. (1995). Patterns and Functions of Minimal Response. American Speech, 70(2), 186–199. https://doi.org/10.2307/455815

Gardner, R. (1998). Between Speaking and Listening: The Vocalisation of Understandings. Applied Linguistics, 19(2), 204–224. https://doi.org/10.1093/applin/19.2.204

Giles, H., Coupland, N., & Coupland, J. (1991). Accommodation theory: Communication, context, and consequence. In H. Giles, N. Coupland, & J. Coupland (Eds.), Contexts of Accommodation (pp. 1–68). Cambridge University Press.

Goodwin, C. (1986). Between and Within: Alternative Treatments of Continuers and Assessment. Human Studies, 9(2), 205–217. https://doi.org/10.1007/BF00148127

Guardiola, M., Bertrand, R., Espesser, R., & Rauzy, S. (2012). Listener’s responses during storytelling in French conversation. In INTERSPEECH.

Guthrie, A. M. (1997). On the systematic deployment of Okay and Mmhmm. Pragmatics, 7(3), 397–415. https://doi.org/10.1075/prag.7.3.06gut

Ike, S., & Moulder, J. (2017). Pragmatic Accommodation in Backchannel Sequences in ELF Interactions. Zürich: Societas Linguistica Europea Annual Meeting.

Kogure, M. (2003). Gender differences in the use of backchannels: Do Japanese men and women accommodate to each other? Second Language Acquisition and Teaching. University of Arizona, ProQuest Dissertations Publishing.

Labov, W., & Waletzky, J. (1966). Narrative Analysis: Oral Versions of Personal Experience. In J. Helm (Ed.), Proceedings of the 1966 Annual Spring Meeting of the American Ethnological Society (pp. 12–43). Seattle and London: University of Washington Press. https://doi.org/10.1075/jnlh.7.02nar

Norrick, N. R. (2012). Listening practices in English conversation: The responses responses elicit. Journal of Pragmatics, 44(5), 566–576. https://doi.org/10.1016/j.pragma.2011.08.007

O’Keeffe, A., & Adolphs, S. (2008). Response tokens in British and Irish discourse: Corpus, context and variational pragmatics. In K. P. Schneider & A. Barron (Eds.), Variational Pragmatics (pp. 69–98). Amsterdam: John Benjamins.

Reid, J. (1995). A study of gender differences in minimal responses. Journal of Pragmatics, 24(5), 489–512. https://doi.org/10.1016/0378-2166(94)00066-N

Schegloff, E. A. (1982). Discourse as an interactional achievement: Some uses of “uh huh” and other things that come between sentences. Analyzing Discourse: Text and Talk, 71–93.

Schweitzer, A., & Lewandowski, N. (2012). Accommodation of backchannels in spontaneous speech. the booklet of the International Symposium on Imitation and Convergence in Speech.

White, S. (1989). Backchannels across Cultures: A Study of Americans and Japanese. Language in Society, 18(1), 59–76.

‘And it was all like weird’ – Some new uses of intensifiers in contemporary British speech

The variability and ongoing innovation of intensifiers make them difficult to study. Corpora and corpus-linguistic methods have recently provided better opportunities to study ongoing changes in this area. The aim of this paper is to study the emergence of the intensifiers well, right, all (like) with adjectival heads in contemporary British speech on the basis of the new Spoken BNC2014 (Love et al. 2017), using the ‘old’ Spoken BNC from the 1990s as an earlier sampling point. The ‘new’ intensifiers well, right, all had their heyday in earlier English (cf. Ito and Tagliamonte 2003, Rickford et al. 2007), then declined in importance, only to re-emerge as innovations in the spoken language of adolescents, as is evident from new spoken corpora:

          (1) ah it 's right cute this
          (2) they 're well old
          (3) I don't know but it looks all funny
          (4) it just goes all like crappy

The research questions are as follows:

- How frequent are the ‘new’ intensifiers in the Spoken BNC2014, and what can we conclude about the quantitative changes they have undergone recently by comparing their frequencies with those in the Spoken BNC from the 1990s?

- What is the relation between the intensifier and the linguistic (syntactic and semantic) context? How are the intensifiers used with different (trendy or commonplace) adjectives and with positive and negative values (semantic prosody)?

- What is the relation between the frequency and use of the intensifiers and the speaker’s age, gender and social class? Does age-grading come into the picture?

Theoretically the study takes its inspiration and starting-point from the interest in the mechanisms through which intensifiers re-emerge in the spoken language of particular speakers and how these innovations interact with long-term, stable developments of the intensifiers (Macaulay 2006, Barnfield and Buchstaller 2010, D’Arcy 2015).


Barnfield, K. and I. Buchstaller. 2010. Intensification on Tyneside: Longitudinal developments and new trends. English World-Wide 31:252-287.

D’Arcy, A. 2015. Stability, stasis and change. The longue durée of intensification. Diachronica 32(4): 449-493.

Ito, R. and S. Tagliamonte. 2003. Well weird, right dodgy, very strange, really cool: Layering and recycling in English intensifiers. Language in Society 32: 257-279.

Love, R., Dembry, C., Hardie, A., Brezina, V. and McEnery, T. 2017. “The Spoken BNC2014 - designing and building a spoken corpus of everyday conversations.” International Journal of Corpus Linguistics 22 (3): 319-344.

Macaulay, R. 2006. Pure grammaticalization: The development of a teenage intensifier. Language Variation and Change 18: 267-283.

Rickford, J., Wasow, T., Zwicky, A. and I. Buchstaller. 2007. Intensive and quotative all: Something old, something new. American Speech 82(1): 3-31.

Cool system, lovely patterns, awesome results: A cross-variety comparison of adjectives of positive evaluation

Speakers in different places and different generations are known to vary markedly in their use of discourse-pragmatic phenomena, including general extenders (Cheshire, 2007; Tagliamonte & Denis, 2010), quotatives (e.g. Buchstaller, 2014), and adverbs (Aijmer, 2002; 2008). However, little research has been conducted on adjective variation (but see Tagliamonte & Brooke, 2014), despite the fact that adjectives are key resources for emotional expression (e.g. Matesic & Memisevic, 2016). Indeed, adjectives that encode positive affect add notable emotion and pragmatic nuance to vernacular discourse:

              It was fantastic. The singers were amazing and gave a great history of music.

This paper examines nearly 5000 of these adjectives in spoken community-based corpora of British and Canadian English collected in the late 1990s and early 2000s and employs quantitative comparative methods (Poplack & Tagliamonte, 2001) and statistical modeling to analyze the data.

The distribution of forms across generations mirrors the diachronic development contained in the OED: older forms, such as wonderful, amazing, and terrific are favoured by elderly speakers, while newer variants, such as super, fantastic, and brilliant are favoured by younger speakers. British English stands out for its use of lovely (32.3%) and its recycling of an older form, great, whose share rises from the oldest to the youngest speakers (17.2% > 19.8% > 46.4%). Canadian English is distinguished by high rates of cool (24.4%) and one of the most recent forms, awesome (5.7%), especially among speakers under 30. Mixed effects regressions expose important linguistic parallels between varieties: older variants are favoured in attributive position, while newer variants favour predicative and ‘stand alone’ uses. Co-occurrence with intensifiers is positively correlated with speaker age. Taken together, the results lead us to suggest that adjective variation is internally structured, but the forms themselves are highly sensitive to place, time and pragmatic force, opening up new possibilities for pinpointing what actuates linguistic change.


Aijmer, Karin (2002). English discourse particles, Evidence from a corpus. Amsterdam and Philadelphia: John Benjamins.

Aijmer, Karin (2008). Modal adverbs in interaction – obviously and definitely in adolescent speech. In Nevalainen, T., Taavitsainen, I., Pahta, P. & Korhonen, M. (Eds.), The Dynamics of Linguistic Variation: Corpus evidence on English past and present. 61–83.

Buchstaller, Isabelle (2014). Quotatives: New Trends and Sociolinguistic Implications. Malden and Oxford: Wiley-Blackwell.

Cheshire, Jenny (2007). Discourse variation, grammaticalisation and stuff like that. Journal of Sociolinguistics 11(2): 155-193.

Matesic, Mihaela & Memisevic, Anita (2016). Pragmatics of adjectives in academic discourse: from qualification to intensification. Jezikoslovlje 12(1-2): 179-206.

Poplack, Shana & Tagliamonte, Sali A. (2001). African American English in the diaspora: Tense and aspect. Malden: Blackwell Publishers.

Tagliamonte, Sali A. & Brooke, Julian (2014). A weird (language) tale: Variation and change in the adjectives of strangeness. American Speech 89(1): 4-41.

Tagliamonte, Sali A. & Denis, Derek (2010). The stuff of change: General extenders in Toronto, Canada. Journal of English Linguistics 38(4): 335-368.

Discourse Values as indicators of pragmaticalization in Spoken British English – a diachronic view

This paper presents a diachronic analysis of discourse values for the hedges sort of and kind of. Our aim is to evaluate discourse values, defined as the “discourse function in relation to grammatical function expressed (in per cent)” (see Stenström, 1990: 161; Aijmer, 2002: 27), and their potential contribution to the quantitative assessment of patterns of language change.

Beeching, who conducted a detailed analysis of a number of pragmatic markers (including you know, I mean, like, sort of, well, and just) in spoken British English contexts, highlights the usefulness of the discourse value (or D-value) for investigations into pragmaticalization, functional pragmatic marker development, and indexicalization (2016: 78-80).

This study expands on Beeching’s work and provides a comparative study of the discourse values of the hedges sort of and kind of as found in subsets of the BNC 1994 and the newly compiled BNC 2014. Sort of and kind of are versatile pragmatic markers that can occur in pre- and postmodification on all syntactic levels (examples 1 to 4). Their propositional function can be defined as typification of noun phrases (see example 5), following the analyses of Fetzer (2010) and Brems (2010).

          (1) They're sort of nasty to them.
          (2) You know that, sort of?
          (3) It just sort of reached your earhole.
          (4) They're nice sort of wrapped up.
          (5) Asked her what sort of coins she uses.

The analysis thus focuses on the changing relative frequencies of use between pragmatic contexts and propositional contexts and whether this change is in accordance with current patterns of pragmaticalization of sort of and kind of. It concludes with general commentary on discourse values as quantitative evidence for language change.
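The discourse value defined above can be sketched as a simple percentage. The sketch assumes one reading of the definition, namely the share of discourse (pragmatic) uses among all tokens of a form; the function name and the token counts are hypothetical and do not come from the BNC data.

```python
def discourse_value(discourse_tokens: int, total_tokens: int) -> float:
    """Share of discourse (pragmatic) uses among all tokens of a form, in per cent."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    if not 0 <= discourse_tokens <= total_tokens:
        raise ValueError("discourse_tokens must lie between 0 and total_tokens")
    return 100 * discourse_tokens / total_tokens


# Hypothetical counts for one form at two sampling points:
# (tokens coded as pragmatic, all tokens in the sample).
samples = {"earlier": (150, 200), "later": (180, 200)}

d_values = {period: discourse_value(d, n) for period, (d, n) in samples.items()}
change = d_values["later"] - d_values["earlier"]
print(d_values, f"change: {change:+.1f} percentage points")
```

A rising D-value across sampling points would be in line with ongoing pragmaticalization, while a stable value would suggest that the balance between propositional and pragmatic uses has not shifted.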

“It's just a little weird, is all” The development and use of sentence-final is all

Sentence-final is all has received little attention in the literature. Its use is a relatively recent development, dating from the late nineteenth century (see example (1)), and is mostly restricted to colloquial American English (Delin 1992; Follett 1998, s.v. all):

     (1) she lands on the floor almost at her husband’s feet, and one sharp little cry is all. (1883 COHA)

Shibasaki (2016) considers is all a quotative marker similar to BE all as defined in e.g. Rickford et al. (2007), but the citations in the OED (s.v. be v., def. P2h) suggest otherwise:

     (2) Expensive? Naaaw. Three hundred, is all. (1939)

     (3) You didn’t see the bus, is all. (1954)

Here, is all does not appear to represent reported speech so much as to refer back to the preceding text. We agree with the OED that in this use is all implies ‘that is all there is to be said’. In our data, speakers often use it to close a topic and to distance themselves from an unwanted interpretation of the preceding utterance, as in example (2).

This paper examines the historical development and pragmatic function of sentence-final is all drawing on data from various corpora (see references). We argue that sentence-final is all derives from postponed independent or conjoined that BE all, as in examples (4) and (5). Our historical data do not support Ando’s (2005) and Fujii’s (2006) suggestion that that is all may represent a shortening of a longer construction, such as that is all I say/mean.

     (4) I would but see him, That is all. (1600 EEBO)
     (5) I am not well in health, and that is all (1623 EEBO)

We conclude that a conversational implicature arose from that is all ‘do not infer anything more’, triggering the development of reduced is all towards a discourse-pragmatic marker.


Ando, Sadao. 2005. Lectures on Modern English Grammar [Gendai Eibunpo Kogi]. Tokyo: Kaitakusha.

Delin, Judy. 1992. Re: 3.174 All’s. Linguistlist. 23 February 1992. https://linguistlist.org/issues/3/3-179.html#1

Fujii, Kenzo. 2006. English in America: Its Usage and Pronunciation [Amerika no Eigo: Goho to Hatsuon]. Tokyo: Nan’un-do.

Follett, Wilson. 1998. Modern American usage. Revised by Erik Wensberg. New York: Hill & Wang.

OED = Oxford English Dictionary. 2000–. Ed. Michael Proffitt. 3rd edn. online. Oxford: Oxford University Press. See http://www.oed.com/

Rickford, John R., Isabelle Buchstaller, Thomas Wasow, & Arnold Zwicky. 2007. Intensive and quotative all: Something old, something new. American Speech 82(1), 3-31. doi: 10.1215/00031283-2007-001

Shibasaki, Reijirou. 2016. Look, I’m just saying I’m undecided, is all: The emergence of a sentence-final quotation marker in English. Presentation at ISLE-4, Poznań, 18-21 September.


CED = A corpus of English dialogues 1560–1760. Compiled under the supervision of Merja Kytö (Uppsala University) and Jonathan Culpeper (Lancaster University). http://www.engelska.uu.se/forskning/engelska-spraket/elektroniska-resurser/a-corpus

CEN = The corpus of English novels. Compiled by Hendrik De Smet. See https://perswww.kuleuven.be/~u0044428/cen.htm

CLMET3.0 = The corpus of Late Modern English texts, version 3.0. Created by Hendrik De Smet, Hans-Jürgen Diller, and Jukka Tyrkkö. https://perswww.kuleuven.be/~u0044428/clmet3_0.htm

COCA = Davies, Mark. (2008-) The corpus of contemporary American English: 520 million words, 1990–present. Available online at https://corpus.byu.edu/coca/.

COHA = Davies, Mark. 2010–. The corpus of historical American English: 400 million words, 1810–2009. Available online at http://corpus.byu.edu/coha/

EEBO = Davies, Mark. (2017) Early English Books online. Part of the SAMUELS project. Available online at https://corpus.byu.edu/eebo/.

OBC = Magnus Huber, Magnus Nissel, Patrick Maiwald, and Bianca Widlitzki. 2012. The Old Bailey corpus. Spoken English in the 18th and 19th centuries. www.uni-giessen.de/oldbaileycorpus

SOAP = Davies, Mark. (2011-) Corpus of American soap operas: 100 million words. Available online at https://corpus.byu.edu/soap/.

General Extenders in English and Spanish among Southern Arizona Bilinguals

This study analyzes the use of general extenders in recorded conversations in English and Spanish between nine pairs of young adult Spanish-English bilingual friends from Southern Arizona. Building on previous studies of general extenders in English (Cheshire, 2007; Pichler & Levey, 2011; Wagner et al., 2015) and Spanish (Cortés, 2006; Fernández, 2015), 325 tokens of general extenders were analyzed quantitatively according to function (referential or nonreferential), length, sex, and language dominance. Linear and logistic mixed-effects models were fitted in R with random intercepts for each participant to take into account cross-individual variation.

It was expected that general extenders would be susceptible to borrowing in a language contact situation since discourse-pragmatic features often appear on the periphery of grammar and are detachable (e.g., Brody, 1995). However, in the speech of the same Spanish-English bilinguals, contact with English did not appear to influence the use of general extenders in Spanish. No English forms of general extenders were found in Spanish. Moreover, general extenders in Spanish were significantly longer and were used to fulfill referential functions more often than general extenders in English, regardless of sex and language dominance. Lastly, referential general extenders were significantly longer than nonreferential general extenders in both English and Spanish, mirroring the results of Wagner et al.’s (2015) study of general extenders in English.

As the first study to analyze the use of general extenders in English and Spanish in the speech of the same bilinguals, these results contribute to our knowledge of the limited permeability of discourse in the speech of bilinguals. They also underline the ability of bilinguals to both understand and reproduce the subtleties of the use of these features in the two languages they speak.

Selected References

Brody, J. (1995). Lending the unborrowable: Spanish discourse markers in Indigenous American Languages. In C. Silva-Corvalán (Ed.), Spanish in four continents: Studies in language contact and bilingualism (pp. 132-148). Washington, D.C.: Georgetown UP.

Cheshire, J. (2007). Discourse variation, grammaticalisation and stuff like that. Journal of Sociolinguistics, 11(2), 155-193.

Cortés, L. (2006). Los elementos de final de serie enumerativa del tipo y todo eso, o cosas así, y tal, etcétera en el discurso oral en español. Perspectiva textual. BISAL, 1, 82–106.

Fernández, J. (2015). General extender use in spoken peninsular Spanish: Metapragmatic awareness and pedagogical implications. Journal of Spanish Language Teaching, 2(1), 1-17.

Pichler, H., & Levey, S. (2011). In search of grammaticalization in synchronic dialect data: General extenders in northeast England. English Language and Linguistics, 15(3), 441-471.

Wagner, S. E., Hesson, A., Bybel, K., & Little, H. (2015). Quantifying the referential function of general extenders in North American English. Language in Society, 44(5), 705-731.

The borrowability of English swearwords in Dutch: a variationist approach

This paper presents a study on contact-induced discourse-pragmatic variation and change, quantitatively addressing the borrowability of 882 English swearwords in Dutch and qualitatively studying the way in which these pragmatic units “assume, in addition to the expression of emotional attitudes, various discourse functions” (Dewaele 2004).

Methodologically, we aim to introduce innovations to research on swearing and borrowability by relying on set-external proof to uncover which swearwords are more prone to borrowing than others (compare van Hout & Muysken 1994, Zenner et al. 2013). Instead of focusing only on the frequency of the English swearwords that are effectively attested in the Dutch lexicon, we start off from a (near-)comprehensive list of all potentially borrowed swearwords in English. This list was created by combining input from lexicographical sources (e.g. Rawson 1989) and online lists of swearwords (compare Wang et al. 2014).

Through a quantitative variationist analysis, we then verify what determines which of these English forms are attested in Dutch and which are not. Specifically, we rely on a Twitter corpus of six million Dutch tweets published in the Low Countries: as a blending area of registers and modalities, of proximity and distance, Twitter provides us with a rich empirical basis for conducting swearword research. Three types of predictors are included to explain the attested variation in borrowability: swearword-specific parameters (e.g. offensiveness ratings; see Dewaele 2016), contact-linguistic parameters (e.g. speech economy; see Chesley & Baayen 2010), and lectal parameters (the contrast between Belgian Dutch and Netherlandic Dutch tweets; see Ruette 2018). Multiple correspondence analyses and regression trees reveal a clear impact of both swearword-specific (e.g. offensiveness ratings) and more general contact-linguistic (e.g. speech economy) parameters.

No significant differences are found between the Belgian Dutch and Netherlandic Dutch data. In our interpretation of the results, we qualitatively discuss specific examples that demonstrate the linguistic creativity of both groups of language users in embedding the English bad words in otherwise Dutch tweets, and support the thesis that swearwords are highly similar to discourse particles (cp. Dewaele 2005).


Chesley, Paula, and Harald Baayen. 2010. “Predicting New Words from Newer Words: Lexical Borrowings in French.” Linguistics 48(4): 1343–1374.

Dewaele, Jean-Marc. 2004. The Emotional Force of Swearwords and Taboo Words in the Speech of Multilinguals. Journal of Multilingual and Multicultural Development 25:2-3, 204-222.

Dewaele, Jean-Marc. 2016. “Thirty Shades of Offensiveness: L1 and LX English users’ Understanding, Perception and Self-Reported Use of Negative Emotion-Laden Words.” Journal of Pragmatics 94: 112–127.

Rawson, Hugh. 1989. Wicked Words: A Treasury of Curses, Insults, Put-Downs, and Other Formerly Unprintable Terms from Anglo-Saxon Times to the Present. New York: Crown Publishers.

Ruette, Tom. 2018. “Regional Variation in the Source Domains for Dutch Swearing.” In Linguistic Taboo Revisited. Novel Insights from Cognitive Perspectives, ed. by Andrea Pizarro Pedraza. Berlin: Mouton de Gruyter.

Van Hout, Roeland, and Pieter Muysken. 1994. “Modeling Lexical Borrowability.” Language Variation and Change 6(1): 39–62.

Wang, Wenbo, Lu Chen, Thirunarayan Krishnaprasad, and Amit Sheth. 2014. “Cursing in English on Twitter.” Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing: 415–424.

Zenner, Eline, Dirk Speelman, and Dirk Geeraerts. 2013b. “What Makes a Catchphrase Catchy? Possible Determinants in the Borrowability of English Catchphrases in Dutch.” In New Perspectives on Lexical Borrowing, ed. by Eline Zenner and Gitte Kristiansen, 41–64. Berlin: Mouton de Gruyter.

Oh my god! / Herregud! What governs speakers’ choices of borrowed vs. domestic forms of discourse-pragmatic variables?

English exerts a major influence on other languages, and borrowing is a significant product of language contact. This includes the borrowing of discourse-pragmatic items such as politeness formulae, greetings, expletives, etc. (Andersen 2014; Andersen 2017; Peterson 2017; Terkourafi 2009). One intriguing issue that has received little attention in contact linguistics, but which lends itself nicely to a variationist socio-pragmatic approach, is the question of what motivates speakers’ choice of a borrowed form versus a domestic alternative and how this choice is constrained by contextual factors. This paper considers English-based forms that are used as discourse-pragmatic items in Norwegian. I consider four different items, two polite expressions and two interjections, and their domestic alternatives: please, used in requests alongside vær så snill; sorry, used in polite excuses alongside jeg beklager/er lei (meg) for; expletives with fuck, including what the fuck vs. domestic hva faen; and oh my god, used alongside herregud and similar forms. I introduce a methodology for coding each variant in terms of its illocutionary force and speech act type, with a view to exploring the pragmatic conditioning of the use of such pragmatic Anglicisms. The aim is to assess whether the forms in question can justifiably be considered variants of a discourse-pragmatic variable. Rather than seeing the English forms as replacing their domestic equivalents, I argue that we can see signs of a pragmatic division of labour due to differences in the illocutionary force of the borrowed vs. domestic variants, e.g. such that Anglicisms are used in speech situations that are potentially less offensive, while domestic forms tend to be preferred where there is a greater need for face-threat mitigation. I explore four different corpora of spoken Norwegian: UNO, NoTa-Oslo, the Big Brother corpus and the Scandinavian Dialect Corpus. Since not all of these corpus data are conversational, the analysis is augmented with data from fictional dialogue drawn from a large text archive.


Andersen, Gisle. 2014. Pragmatic borrowing. Journal of Pragmatics 67: 17-33.

Andersen, Gisle. 2017. The pragmatic turn in studies of linguistic borrowing. Journal of Pragmatics 113: 71-76.

Peterson, Elizabeth. 2017. The nativization of pragmatic borrowings in remote language contact situations. Journal of Pragmatics 113: 116-126.

Terkourafi, Marina. 2009. "Thank you, Sorry and Please in Cypriot Greek: What happens to politeness markers when they are borrowed across languages?" Journal of Pragmatics 43: 218-235.


The UNO corpus: http://clu.uni.no/humfak/uno/

NoTa-Oslo: http://www.tekstlab.uio.no/nota/oslo/index.html

The Big Brother corpus: http://www.tekstlab.uio.no/nota/bigbrother/

The Scandinavian Dialect Corpus: http://www.tekstlab.uio.no/nota/scandiasyn/index.html

Bokhylla – The National Archive: https://www.nb.no/

Variation and change in real time in two French-Canadian communities

This paper examines the use of consequence markers ça fait que, so, donc and alors in two genetically-related varieties of Canadian French. The study is based on corpora collected in the 1970s and 2010s in Montréal, Québec, a majority francophone environment, and Welland, Ontario, a minority francophone environment.

Blondeau, Mougeon & Tremblay (to appear) examined the variable use of consequence markers in the 2010s. They found that Montreal and Welland French are currently evidencing patterns of sociolinguistic divergence. In this paper, we probe this divergence further with a comparison of the community trends over four decades and an analysis of variation across the lifespan for a cohort of 12 speakers in each community. This analysis aims to see how these speakers have positioned themselves vis-à-vis the changes underway in each community. Statistical analysis revealed that, over the four decades, in Montreal, vernacular (ça) fait (que) rose, the English borrowing so remained absent, and standard alors decreased sharply and lost out to donc. In Welland, however, so rose sharply while (ça) fait (que) concomitantly decreased and, although alors and donc have competed as standard variants, they both underwent a moderate decrease. Analyses of the effects of social factors reveal complex patterns of change in both communities, driven primarily by gender and SES in Montreal and by bilingualism and SES in Welland. The analysis of change during the lifetime of the twelve speakers shows that their socio-biographic trajectory (e.g., occupational history) explains why some speakers participate in the ongoing changes while others remain stable or even retreat from them. Mougeon et al. (2016) found that in the 1970s, Welland and Montreal French shared the same sociolinguistic norms; the present study shows that this is no longer the case and that, as far as the consequence markers are concerned, the seeds of change were already present in the 1970s.


Blondeau, H., Mougeon, R. & Tremblay, M. (to appear). Analyse comparative de ça fait que, alors, donc et so à Montréal et à Welland: mutations sociales, convergences, divergences en français laurentien, Journal of French Language Studies.

Martineau, F. & Séguin, M-C. (2016). Le Corpus FRAN : réseaux et maillages en Amérique française, Corpus, no. 15, Corpus de français parlés et français parlés des corpus.

Mougeon, R., Hallion, S., Bigot, D. & Papen, R. (2016). Convergence et divergence sociolinguistique en français laurentien: l’alternance rien que/juste/seulement/seulement que/neque, Journal of French Language Studies, 26, (2), 115-154.

Sankoff, D., Tagliamonte, S. & Smith, E. (2005). Goldvarb X: A variable rule application for Macintosh and Windows, Department of Linguistics, University of Toronto.

Cross-varietal differences in prospective/retrospective preference: The perception of final connectives by Irish and American English speakers

Irish English is known to have a retrospective (or final particle) use of but and so in clause-final position (Hickey 2007, Amador-Moreno 2010, Kallen 2013), as in:

(1) It’s all that it is Janie it’s muscular spasm but                             (SPICE-Ireland, P1A-053)
(2) It was you opened the curtains so                                               (SPICE-Ireland, P1A-050)

On the other hand, Mulder & Thompson (2008) observe the lack of final particle but in American English in their comparison with Australian English.

These observations were supported by a comparison of two spoken corpora of American and Irish English (Santa Barbara Corpus of Spoken American English and the comparable categories of SPICE-Ireland, P1A-001 to P1A-100 and P1B-001 to P1B-020). In SPICE, there were nine tokens of the retrospective use of final but and eight tokens of such use of final so, but neither of those final connectives was attested in SBC. This result partly points to a marked preference for the final-tag construction in Irish English, where final tags are defined as retrospective types of pragmatic markers.

Our study further carried out a questionnaire survey of Irish and American English speakers to explore their perception of the two connectives in final position. Given that final connectives allow two possible interpretations (prospective/final hanging and retrospective/final particle), we investigated how final connectives would be interpreted by native speakers of the two varieties. The survey results reveal that retrospective readings were more favored by Irish English speakers, whereas prospective readings were more dominant among American English speakers. An additional interview survey was conducted to ensure a more accurate understanding of the final connectives by American English speakers. The findings verify that Irish English manifests a greater degree of constructional entrenchment of final-tagged structures, which may result in interpretive discrepancies between speakers of the two varieties.


Amador-Moreno, Carolina P. 2010. An Introduction to Irish English. London: Equinox.

Hickey, Raymond. 2007. Irish English. History and Present Forms. Cambridge: Cambridge University Press.

Kallen, Jeffrey L. 2013. Irish English Volume 2: The Republic of Ireland. Berlin: Mouton de Gruyter.

Mulder, Jean & Thompson, Sandra A. 2008. The grammaticization of but as a final particle in English conversation. In Crosslinguistic Studies of Clause Combining, Ritva Laury (ed.), 179–204. Amsterdam: John Benjamins.

Three vernacular determiners in York English: evidence for discourse-pragmatic factors in grammaticalization trajectories

The variety of English spoken in the city of York (UK) has three vernacular determiners: a zero article, a reduced, vowel-less determiner (Jones 1999), and a complex demonstrative construction of the type this here NP. They are illustrated in (1) with data from the York English Corpus (Tagliamonte 1996–1998).

(1)     (a) And when Ø river come up it used to flood up. (Gladys Walton, 87) [ZERO]

         (b) Does ʔ teacher play it on the guitar? (Mark Aspel, 24) [REDUCED]

         (c) What is that there red book do you know? (Albert Jackson, 66) [COMPLEX]

In research with Sali Tagliamonte (Rupp & Tagliamonte 2017) we have asked: Why do these vernacular determiners occur in York English, and what is their social and grammatical function? We have probed the occurrence of the vernacular determiners from the joint perspective of language variation and change, historical linguistics and discourse-pragmatics. We have conducted both qualitative and quantitative multivariate analyses (Goldvarb; Sankoff, Tagliamonte and Smith 2015) of the contemporary York English Corpus (1.2 million words; using a socially stratified subsample of 50 speakers) and several historical corpora (including The Oxford English Dictionary and The Penn-York Computer Annotated Corpus of a Large Amount of English 1473-1800).

Noting that the definite article first emerged in the north of England (McColl Millar 2000), we postulate that the vernacular determiners are best understood as representing different stages in the grammaticalization trajectory of the definite article (Greenberg 1978; Lyons’ 1999 Definiteness Cycle). We demonstrate that the three vernacular determiners have acquired new social and discourse-pragmatic uses of conveying Yorkshire identity (Tagliamonte and Roeder 2009), psychological distance (Johannessen 2006) and discourse-new, hearer-old information (Prince 1981). We conclude that rather than having disappeared, the determiners have remained productive. Following Traugott (1995) and Epstein (1995), we envisage that discourse-pragmatic factors may influence grammaticalization trajectories.


Ecay, Aaron. 2015. The Penn-York Computer-annotated Corpus of a Large amount of English 1473-1800 based on the TCP (PYCCLE-TCP). https://github.com/uoylinguistics/pyccle. [Accessed 18 August 2016].

Epstein, Richard. 1995. The later stages in the development of the definite article: Evidence from French. In Henning Andersen (ed.), Historical Linguistics 1993, 159–75. Amsterdam: John Benjamins.

Greenberg, Joseph H. 1978. How does a language acquire gender markers? In Joseph H. Greenberg, Charles A. Ferguson & Edith A. Moravcsik (eds.), Universals of human language, vol. 3 Word structure, 47–82. Stanford: Stanford University Press.

Johannessen, Janne B. 2006. Just any pronoun anywhere? Pronouns and “new demonstratives” in Norwegian. In Torgrim Solstad, Atle Grønn, and Dag Haug (eds.), A Festschrift for Kjell Johan Sæbø, 91–106. Oslo: University of Oslo.

Jones, Mark J. 1999. The phonology of definite article reduction. In Clive Upton & Katie Wales (eds.), Dialectal variation in English. Proceedings of the Harold Orton Centenary Conference 1998. Leeds Studies in English 30: 103–121.

Lyons, Christopher. 1999. Definiteness. Cambridge: Cambridge University Press.

McColl Millar, Robert. 2000. System collapse, system rebirth: The demonstrative pronouns of English 900-1350 and the birth of the definite article. Oxford: Peter Lang.

Oxford English Dictionary, 2nd edn. 1989. Oxford: Clarendon Press.

Prince, Ellen F. 1981. Toward a taxonomy of given-new information. In Peter Cole (ed.), Radical Pragmatics, 223–255. New York: Academic Press.

Rupp, Laura and Sali A. Tagliamonte. 2017. This here town: evidence for the development of the English determiner system from a vernacular demonstrative construction in York English. English Language and Linguistics. http://dx.doi.org/10.1017/S1360674317000326

Sankoff, David, Sali A. Tagliamonte and Eric Smith. 2015. Goldvarb Yosemite: A multivariate analysis application for Macintosh. Toronto, Canada: Department of Linguistics, University of Toronto. [Accessed 10 February 2017] http://individual.utoronto.ca/tagliamonte/goldvarb.html.

Tagliamonte, Sali A. 1996-1998. Roots of Identity: Variation and Grammaticization in Contemporary British English. Economic and Social Sciences Research Council (ESRC) of Great Britain. Reference #R000221842.

Tagliamonte, Sali A. & Rebecca V. Roeder. 2009. Variation in the English definite article: Socio-historical linguistics in t'speech community. Journal of Sociolinguistics 13: 435–471.

Traugott, Elizabeth C. 1995. Subjectification in grammaticalization. In Dieter Stein & Susan Wright (eds.), Subjectivity and subjectivisation in language, 31–54. Cambridge: Cambridge University Press.

Insubordination and intralinguistic variation: a quantitative corpus analysis of insubordinate subjunctive complement clauses in varieties of Spanish

This paper presents a quantitative corpus analysis of intralinguistic variation with independent subjunctive complement constructions in Spanish, as in (1):

  1. ¡Que te ayude Antonio!
    ‘Antonio should help you!’

This example illustrates the phenomenon of insubordination, the use as an independent clause of formally subordinate clauses (Evans 2007). On the one hand, the construction includes an initial complementizer and a verb in the subjunctive mood, which is typical of subordinate complement clauses. On the other hand, it is syntactically and pragmatically independent, since no candidate main clause material occurs or can be reconstructed in the speaker’s turn or in the preceding turns.

Insubordinate constructions express functions similar to those of discourse markers: interactional, modal and discourse-organizational. In particular, insubordinate subjunctive complement (ISC) constructions can express (third-person) orders (1), wishes (2) or quoted orders (3) (Sansiñena 2015):

   2.    ¡Que pases un buen día!
          ‘I hope you have a good day!’

   3.     A: ¡Ven!
           B: ¿Qué?
           A: ¡Que vengas!
          ‘A: Come!
          B: What?
          A: I told you to come!’

Our study addresses understudied aspects of insubordination. Firstly, it has been repeatedly shown in the literature that this phenomenon is found cross-linguistically in spoken language (Evans 2007, Dwyer 2016). Less attention has been paid to the fact that some cases of insubordination can also be found in written genres. Secondly, some studies have shown that there are inter-linguistic differences in the availability of specific semantic types among closely related languages, in such a way that some languages only allow for some of the meanings/functions shown in (1)-(3) (Verstraete and D’Hertefelt 2016). However, the possibility that variation occurs between regional varieties of the same language remains relatively unexplored (cf. Gras & Sansiñena 2017, Corr 2018).

The aim of this paper is to identify possible instances of variation in the distribution of meanings/functions of ISCs in different regional varieties of Spanish, and to pinpoint which aspect of oral language is a determining factor in explaining variation among genres. For this purpose, we conducted a quantitative analysis of a corpus that represents three varieties of Spanish (Peninsular, Chilean, and Argentinian). Each subcorpus consists of two oral genres (conversation and interview) and two written genres (social media and news reports), which vary in the level of formality, interaction and intersubjectivity.

Preliminary results indicate that all meanings/functions of ISCs are available across all varieties under study, suggesting that they constitute a well-entrenched element of Spanish grammar, while they are mostly found in spontaneous interactional discourse, either spoken or written.


Corr, Alice. 2018. 'Exclamative' and 'quotative' illocutionary complementisers in Catalan, European Portuguese and Spanish: a study in Ibero-Romance syntactic 'near-synonymy'. Languages in Contrast: International Journal for Contrastive Linguistics 18(1), 72-101.

Dwyer, A. 2016. Ordinary insubordination as transient discourse. In Evans, N. & Watanabe, H. (eds.) Insubordination. Amsterdam: John Benjamins. 183-208.

Gras, Pedro & María Sol Sansiñena. 2017. Exclamative complement insubordination in Spanish. Journal of Pragmatics 115, 21-36.

Evans, N. 2007. Insubordination and its uses. In Nikolaeva, I. (ed). Finiteness. Oxford: OUP. 366-431.

Sansiñena, S. 2015. The multiple functional load of que. Unpublished PhD. University of Leuven.

Verstraete, J. C. and D’Hertefelt, S. 2016. Running in the family: Patterns of complement insubordination in Germanic. In Evans, N. & Watanabe, H. (eds.) Insubordination. Amsterdam: John Benjamins. 65-88.

Putting the Romance back into reported speech: Evidence from Quebec French, Acadian French, Brazilian Portuguese and Italian

Research on English identifies the quotative system as the locus of rampant variability and innovation (Buchstaller 2014). Although other languages are reportedly witnessing the emergence of new quotatives (e.g., Foolen 2008; Buchstaller & van Alphen 2012), many such claims fail to situate the candidate for change within the wider variable system in which it is emerging and/or do not adduce any real- or apparent-time evidence to demonstrate what has changed.

We address these shortcomings by conducting a comparative variationist analysis of 3,600 quotative tokens extracted from 197 speakers representing four vernacular varieties: Quebec French (QF), Acadian French (AF), Brazilian Portuguese (BP) and Italian (ITA), recorded respectively in Ottawa-Gatineau (2014), north-east New Brunswick (2013), São Paulo (2009-2013; see Mendes 2013), and different regions of Italy (2005; see Cresti & Moneglia 2005). We use these datasets, each containing an apparent-time component, to: (i) identify cross-linguistic patterns of quotative variation; (ii) probe evidence of linguistic change; and (iii) compare trajectories of change in different varieties by operationalizing measures of advanced grammaticalization (e.g., using grammatical person, content of the quote, etc.; see Ferrara & Bell 1995).

The results turn up a number of key findings. In addition to generic speech verbs, all varieties, except ITA, contain quotatives incorporating markers of similarity/manner (e.g., QF/AF être comme ‘be like,’ BP assim ‘like this’). Apparent-time change in BP (involving quotative tipo ‘type’ in speakers aged < 30) and ITA (involving a speaker nominal quotative) is relatively incipient, but it is more salient in QF and AF, where the innovative variant être comme has progressed further along the cline of grammaticalization in QF than AF.

Taken together, the results call for a more circumspect and quantitatively informed assessment of claims that the quotative systems of typologically related languages are experiencing independent parallel innovations (Buchstaller & van Alphen 2012: xii).


Buchstaller, Isabelle. 2014. Quotatives: New Trends and Sociolinguistic Implications. Oxford: Wiley-Blackwell.

Buchstaller, Isabelle and Ingrid van Alphen (eds.) 2012. Quotatives: Cross-linguistic and Cross-disciplinary Perspectives. Amsterdam: John Benjamins.

Cresti, Emanuela and Massimo Moneglia. 2005. C-ORAL-ROM. Integrated Reference Corpora for Spoken Romance Languages. Amsterdam: John Benjamins.

Ferrara, Kathleen and Barbara Bell. 1995. Sociolinguistic variation and discourse function of constructed dialogue introducers. American Speech 70: 265-290.

Foolen, Ad. 2008. New quotative markers in spoken discourse. In Bernt Ahrenholz, Norbert Dittmar & Ursula Bredel (eds.) Empirische Forschung und Theoriebildung. Festschrift für Norbert Dittmar zum 65. Geburtstag. Frankfurt am Main: Peter Lang, pp. 117-128.

Mendes, Ronald Beline. 2013. Projeto SP2010: Amostra da fala paulistana. <http://projetosp2010.fflch.usp.br>

A multi-dimensional, multi-functional and multilingual account of discourse marker variation

Discourse markers (henceforth DMs) are the focus of a very rich field of study, investigating their many forms and functions in various languages. However, they are still rarely studied onomasiologically, especially in spoken multilingual data, as opposed to the bulk of contrastive case studies. This presentation aims to analyze the variation in use and functions of a broad bottom-up selection of DMs across three languages from different typological families, namely French (Romance), English (Germanic) and Polish (Slavic). Such an endeavor requires not only overcoming issues of definition and delimitation of the DM category, accounting for the diversity of their forms in different languages through an operational tertium comparationis (Krzeszowski 1981), but also designing an annotation model that encompasses their full functional spectrum, from the perspective of spoken discourse analysis.

Our study follows a corpus-based methodology drawing on Crible & Degand’s (in press) multilingual annotation scheme for functions of (spoken) DMs. The functional taxonomy distinguishes between four domains (ideational, rhetorical, sequential, interpersonal) that may be combined with eleven functions (e.g. cause, contrast, topic-shift). This taxonomy with two independent levels has been applied to spoken unplanned dialogues in the three languages (approx. 30 minutes; between 5000 and 6000 words), resulting in the identification of 286 DMs in English (30 types), 442 DMs in French (35 types), and 847 DMs in Polish (48 types). The annotations were extracted for contrastive analyses of distribution and variation of DMs and their functions. The results indicate that the multilingual annotation scheme may be validly applied to the three different languages, demonstrating that semantic equivalence of DMs attested in different languages does not necessarily lead to functional and distributional similarities between them (e.g. in the case of English you know, French tu vois and Polish wiesz). The annotation scheme is currently being tested on additional languages (Slovenian, Spanish, L2 English, Brazilian Portuguese).
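
As a rough illustration of the token and type counts reported above, the following Python sketch computes the average number of DM tokens per DM type in each language. It uses only the figures given in this abstract; the dictionary layout and the names reported, tokens_per_type and ratios are assumptions made for illustration and are not part of the annotation scheme itself.

```python
# DM token and type counts as reported in the abstract above.
# The data layout and all names here are illustrative assumptions.
reported = {
    "English": {"tokens": 286, "types": 30},
    "French": {"tokens": 442, "types": 35},
    "Polish": {"tokens": 847, "types": 48},
}


def tokens_per_type(counts):
    """Average number of DM tokens per DM type for one language."""
    return counts["tokens"] / counts["types"]


# One ratio per language, keyed by language name.
ratios = {lang: tokens_per_type(c) for lang, c in reported.items()}

for lang, ratio in ratios.items():
    print(f"{lang}: {ratio:.1f} tokens per type")
```

Because the three samples are of similar but not identical size (between 5000 and 6000 words), such ratios are indicative only and would need normalizing against exact word counts before any cross-linguistic comparison.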


Crible, L. and Degand, L. In press. Reliability vs. granularity in discourse annotation: What is the trade-off? Corpus Linguistics and Linguistic Theory.

Krzeszowski, T.P. 1981. Tertium Comparationis. In J. Fisiak (Ed.), Linguistics: Prospects and Problems, Berlin, Mouton de Gruyter: 301-312.