Singapore weblogs: Between speech and writing

Andrea Sand
Trier University


The aim of the present paper is to shed light on the question whether computer-mediated communication promotes the use of non-standard varieties of New Englishes in writing, e.g. Colloquial Singapore English or Singlish. The database consists of a corpus of weblogs from Singapore as well as informal speech and writing from the Singaporean sub-corpus of the International Corpus of English. The exploratory analysis of the data focuses on three features, which are commonly associated with non-standard, spoken Singaporean English. First, the use of discourse particles, such as lah or lor, is considered, as these are regarded as very typical of Colloquial Singapore English usage. On the level of syntax, zero constituents, i.e. zero subjects, zero objects and zero copulas, are examined to shed light on the question whether these substrate-influenced features of Colloquial Singapore English are also used in weblogs where the immediate context of face-to-face interaction is missing. Finally, quotative like, which is associated with spoken English of younger speakers world-wide, is studied as an example of a recent innovation on a global scale. The analysis reveals that these features are indeed used by the bloggers, but with lower frequencies than in conversations. This is in line with studies of weblogs elsewhere in the Anglophone world.

1. Introduction

Most accounts of the New Englishes are based on a ‘feature-list’ approach, as in McArthur (2002) or Kortmann and Lunkenheimer (2012), for example: A brief survey of the history and present status of a variety is followed by a list of phonological, lexical and grammatical features which are regarded as typical of this variety. We generally do not find out anything about the data base used to compile the feature lists. Nor do we find out whether these features are innovations or retentions of now archaic usage in international Standard English and whether they are frequently or rarely used by the speakers of this particular variety.

Only recently have linguists begun recognizing the importance of context in the study of the New Englishes. The necessity of a context-oriented approach is also pointed out by Mair (2003):

So we might well ask whether, in the study of new varieties of English, it isn’t time linguists recognized the fact that long before ‘Jamaican creole’, ‘West African English’ or ‘Indian English’ end up as decontextualized constructs in linguistic descriptions, they exist as communicative practices available to real people who pursue their mundane aims in specific communities and in very specific historical and social contexts. What is needed is, thus, no more and no less than a discourse-based and dynamic model of varieties of English, which puts the context, the speaker and his/her intentions, and history back into the picture […]. (Mair 2003: xiii)

Focussing on the speakers and writers of a variety, on the linguistic and extra-linguistic context can be achieved by various means. In my previous research (e.g., Sand 2004, 2008) based on various sub-corpora of the International Corpus of English (ICE) it became apparent that many morpho-syntactic variables are heavily influenced by the text type under analysis. Thus I began to focus on individual text types, for example job applications from India (Sand 2012). The study of computer-mediated communication (CMC) appeared especially promising, as previous research had shown that CMC may promote the use of non-standard varieties in writing (e.g., Androutsopoulos & Ziegler 2004; Androutsopoulos 2007 for German). With regard to the New Englishes, Hinrichs (2006) shows the increased uses of written Jamaican Creole in e-mail communication, while Mair (2003a) discusses the uses of English and Jamaican Creole in internet discussion forums. Encouraged by these findings, the aim of the present paper is to investigate the English used in weblogs posted by Singaporean speakers of English to determine whether features generally associated with oral or informal usage are considered acceptable in this rather recent text type. The dynamic sociolinguistic background of English in Singapore will be presented in the next section, followed by a discussion of the database used and some first results with regard to non-standard or generally spoken features.

2. Singapore and Singlish

The ethnic and linguistic make-up of Singapore is quite complex. Of a total population of roughly 3 million people, about 76% are Chinese, 15% Malay, 7% South Asian, mostly from India, and 2% of various other backgrounds, including Eurasians, Europeans and Arabs (McArthur 2002: 338; Wee 2008: 261). The main languages spoken in Singapore are Hokkien, Teochew, Cantonese, Mandarin, Malay, as well as its simplified variety Bazaar Malay, Tamil as well as some other Dravidian languages, English and the local non-standard variety Singlish. Of these, Mandarin, Malay, Tamil and English have official status. English is the language of the administration, law courts and education. In the school system, each child is considered to speak one of the official ‘mother tongues’ based on ethnicity, which means that Chinese children will study Mandarin as a second language, Malay children study Malay and Indian children study Tamil. The ‘mother tongue’ is not necessarily the child’s L1; in fact, a growing number of Singaporean children speak English as their L1, especially in families of high social status (Gupta 1998: 120). The use of Mandarin among the ethnic Chinese has increased drastically over the past decades, supported by the government’s ‘Speak Mandarin’ campaign, which has led to an increased use of Mandarin in many areas of public life, including multi-ethnic workplaces and the media. Since it is very difficult for non-Chinese children to learn Mandarin in school due to the ‘mother tongue’ rule, a large percentage of the population is beginning to feel marginalized and promotes the use of English as a pan-Singaporean language instead (Gupta 1998: 122; Lim & Foley 2004: 5). The number of Singaporeans who speak English customarily has risen steadily and dramatically since the 1970s. In fact, the small percentage of people who do not know any English were generally born before the 1950s (Gupta 1998: 121; Lim & Foley 2004: 6).
It is difficult to pinpoint specific domains for the languages spoken in Singapore. Basically, English can be spoken in any domain and may even be used in cultural or religious activities associated with only one ethnic group (Gupta 1998: 123; Low & Brown 2005: 43). In contact with administrative or educational authorities, English is generally the first choice if it is available to both interlocutors. But the interplay of factors like age, social class, ethnicity, language proficiency and setting makes predictions of language choice very difficult.

Tay (1982: 54) reports how English and Mandarin have become the major languages of inter- and intra-ethnic communication in Singapore since the 1970s, to the effect that especially ethnic Indians and Malays very often use English even in intra-ethnic communication instead of Tamil or Malay. Data from the Singaporean census in 2000 (cf. Table 1) shows that the development towards English as a home language has accelerated since the 1980s.

Ethnic group    Mandarin or other varieties of Chinese    Malay    Tamil    English

Table 1. Use of English as a Home Language (%) (Census 2000, adapted from Low & Brown 2005: 49).

Compared to data from 1980, when only 10% of the Chinese, 2% of the Malays and 24% of the Indians used English as the predominant home language, the table illustrates the growing tendency to use English as a home language, especially among the more educated, non-Malay part of the population (cf. Lim & Foley 2004: 6; Low & Brown 2005: 39–42). What the table does not show, however, is the fact that the majority of Singaporeans today uses more than one home language, as illustrated in Table 2 below:

Languages spoken at home    Spoken by
English                     96% [1]
Mandarin                    87%
Malay                       18%
Cantonese                   13%
Hokkien                     9%
Teochew                     9%
Japanese [2]                1%
Arabic                      1%
Singlish                    1%

Table 2. Languages Spoken as Home Languages by CSW Bloggers

In this context, it is also important to point out that English in Singapore is not a monolithic variety, but rather a continuum including an exonormative standard, an educated variety with a distinctive Singaporean accent and less formal varieties, including Singlish or Colloquial Singapore English. As McArthur (2002: 339) writes:

At the higher end is a government-backed normative variety based on British Standard English, spoken with a near-RP accent, used by the Singapore Broadcasting Corporation, and increasingly influenced by American usage. At the other end is a home-grown colloquial variety to which the name Singlish has long been attached […].

In terms of language attitudes, most Singaporeans hold English in high esteem. Tay (1982: 58) writes that the majority of the population strives to become at least bilingual in English and one other language. Standard English is regarded as useful in terms of education, employment, economic development, inter-ethnic and international communication. In terms of expressing Singaporean identity, closeness and belonging, it is Singlish that takes the prize, despite the government’s efforts to ban it. [3] Mei (2001) conducted a survey in 1999, in which 68% of the respondents were inclined to be positive about Singlish, stressing its role as a marker of Singaporean identity and a “unifying force across ethnic boundaries and socio-economic groups” (Mei 2001: 44). However, Singlish is traditionally associated with informal spoken communication, among friends for example, but also in inter-ethnic discourse with less educated Singaporeans, such as taxi-drivers or food vendors (Low & Brown 2005: 35–43). It remains to be seen in which written domains it may be used in the future.

In terms of its linguistic features, Singlish has been classified as a ‘creoloid’ or ‘semi-pidgin’ (Lim & Foley 2004: 9) because it shows signs of massive restructuring due to the influence of Malay and the various varieties of Chinese spoken in Singapore. This is reflected by the absence of Standard English inflections (e.g., 3rd person singular -s, past tense -ed, cf. Alsagoff & Ho 1998: 137–138; Wee & Ansaldo 2004: 63–66; Fong 2004: 77–82), different options for the constructions of passives, relative clauses and questions (cf. Alsagoff & Ho 1998: 145–151; Low & Brown 2005: 93f., 101f., 107f.; Fong 2004: 97f.), differences in noun classification and article use (cf. Alsagoff & Ho 1998: 143–145; Wee & Ansaldo 2004: 58–62) or the occurrence of zero constituents, which will be discussed in section 4.2. below. There are a large number of loanwords from the other languages spoken in Singapore (cf. Wee 1998), most notably the discourse particles, which will be discussed in section 4.1. below. The pronunciation and intonation of Singlish are also very different from international Standard English, which sometimes causes intelligibility problems with non-Singaporean speakers of English (cf. Trudgill & Hannah 2002: 136f.; Low & Brown 2005: 115–180).

3. Database

The most comprehensive corpus of Singapore English to date is the Singaporean sub-corpus of the International Corpus of English (ICE-SIN). It consists of one million words, covering a large range of text types, such as conversations, speeches or lectures as well as private letters, student essays, academic writing or fiction. The individual files consist of 2,000 words each. The largest part of the corpus is section S1A, consisting of face-to-face conversations (180,000 words) and telephone conversations (20,000 words), both representing private informal spoken communication. Section W1B comprises private letters (30,000 words) and business letters (30,000 words). Of these, only the private letters were used in the analysis to represent unpublished and unedited informal writing. A more detailed account of ICE corpus design and mark-up can be found in Nelson (1996, 1996a) or on the ICE homepage.

A Corpus of Singapore Weblogs (CSW) is presently being compiled at the University of Trier. [4] As a first step, 100 samples of English-language weblogs written by Singaporeans were collected, each 2,000 words long, so that the collection matches the private dialogues of ICE-SIN in size. The annotation is a simplified version of the scheme used for the ICE corpora to allow comparability with the ICE data without including information not necessary for the analysis (cf. Nelson 1996a). All bloggers were asked to fill in consent forms for the inclusion of their material in the corpus, in which they also provided information on their sociolinguistic background, e.g., on age group, gender, languages spoken at home, level of education or longer stays abroad, as these factors may influence their linguistic choices. The contributors are modelled on the speakers sampled for ICE-SIN (cf. Lim & Foley 2004: 12). Despite all efforts to compile a balanced weblog corpus with regard to age group and gender, the corpus is biased towards younger female bloggers (cf. Table 3 below). Whether this is due to a higher percentage of younger females in the Singaporean weblog community or their willingness to answer our request to give their consent cannot be resolved at the moment.

Age group    Gender    Total    Stay abroad
16–25        f         34       9
16–25        m         18       5
26–35        f         20       12
26–35        m         18       6
36+          f         2        1
36+          m         8        4

Table 3. Sociolinguistic Background of CSW Bloggers
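The bias towards younger female bloggers can be read directly off Table 3. The following minimal sketch recomputes the relevant shares from the table; the variable and function names are illustrative only, and the reading of the last column as the number of bloggers reporting a longer stay abroad is an assumption based on the questionnaire described above.

```python
# Rows of Table 3: (age group, gender, bloggers, bloggers with a stay abroad).
# Reading the last column as "number with a longer stay abroad" is an assumption.
TABLE_3 = [
    ("16-25", "f", 34, 9),
    ("16-25", "m", 18, 5),
    ("26-35", "f", 20, 12),
    ("26-35", "m", 18, 6),
    ("36+", "f", 2, 1),
    ("36+", "m", 8, 4),
]


def share(rows, predicate):
    """Percentage of all bloggers whose (age group, gender) satisfies predicate."""
    total = sum(n for _, _, n, _ in rows)
    selected = sum(n for age, gender, n, _ in rows if predicate(age, gender))
    return 100 * selected / total


total_bloggers = sum(n for _, _, n, _ in TABLE_3)
female_share = share(TABLE_3, lambda age, gender: gender == "f")
youngest_share = share(TABLE_3, lambda age, gender: age == "16-25")
abroad_share = 100 * sum(abroad for *_, abroad in TABLE_3) / total_bloggers

print(total_bloggers, female_share, youngest_share, abroad_share)
```

On these figures, women make up somewhat more than half of the sample and the youngest age group about half, while only ten of the 100 bloggers are older than 35.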

The home languages spoken by the bloggers in the sample are very much in line with the census data (cf. Table 2 above), as 96% of the respondents use (a variety of) English as one of their home languages. As the informants were asked to fill in the names of their home languages themselves, it can be regarded as a sign of the increased acceptance of the non-standard variety Singlish that one blogger used this term rather than the more generic English, despite the fact that we did not suggest any languages or labels in the questionnaire. It is important to point out that Mandarin (spoken at home by 87% of the informants) is often spoken in addition to another variety of Chinese, such as Hokkien or Cantonese. This may be interpreted as a result of the government’s 1979 ‘Speak Mandarin’ campaign designed to provide a common standard language for all ethnic Chinese in Singapore (cf. Bokhorst-Heng 1998: 287–299). The number of Malay speakers (18%) is also roughly in line with their representation in the overall population. The glaring absence of Tamil as a home language is due to the ongoing language shift of the ethnic South Asians to English as their only language. In terms of their linguistic backgrounds, the bloggers constitute a representative sample of the younger to middle-aged population as a whole. The corpus design thus ensures a high degree of comparability with the ICE data and, at the same time, allows statistical testing of whether any of the sociolinguistic variables influence the presence or absence of linguistic variables in the data.

4. Data analysis

The first exploratory analysis of the CSW corpus presented here was done on the basis of Wordsmith Tools and the SPSS software package. [5] In a first step, the presence or absence of three features strongly associated with Singlish was tested: the use of discourse particles and the absence of the copula, the subject or object of a clause. For the analysis of zero constituents, two smaller sub-corpora of CSW and ICE-SIN consisting of 20,000 words from 10 randomly sampled files each were manually tagged and analyzed. In a second step, a feature associated with informal spoken English on a global scale, namely the use of like as a discourse marker, hedge or quotative, was examined. The selection of these three features for the pilot study is driven by their close association with spoken language. The discourse particles are almost stereotypically associated with spoken Singlish (e.g. Wee 2004; Low & Brown 2005: 175–180), as is discourse like with younger speakers of English across the globe (e.g. Mair 2009: 55–58; Buchstaller et al. 2010). Comparing the uses of these two different types of discourse markers allows us to place the blogs between a local spoken norm and a global development in informal spoken language. Zero constituents are also generally associated with spoken Singlish (Wee & Ansaldo 2004: 71–72). They have to be retrievable from the context and require a higher degree of contextualization by the addressee. It is thus interesting to see whether they would lend themselves to use in a written text type, thus backing the claims that CMC promotes the use of non-standard spoken features in writing (e.g. Hinrichs 2006 for Jamaican English and Jamaican Creole).

4.1 Discourse particles

The use of discourse particles borrowed from Bazaar Malay, Hokkien, Cantonese and Mandarin is one of the hallmark features of Singlish and less formal Singapore English (cf. Wee 1998: 191–196, Wee 2004: 117–126; Lim 2007 for all). According to Platt (1987: 395) only a(h) and la(h) are widespread along the continuum, while other markers, such as hor, lor, ma, leh, and meh, are only used in the basilect. [6] This situation appears to have changed since, as all of these particles are attested in the mesolectal and acrolectal conversations (S1A) recorded in ICE-SIN (cf. Table 4 below). Lah, the most frequent particle in the conversations and the weblogs, is also attested in the social letters (W1B). Mah and hor, which are far less frequent, do not occur at all in CSW, but lor, the second most frequent particle in the conversations, is attested in the weblogs.

Particles    CSW (200,000 words)    ICE-SIN S1A (200,000 words)    ICE-SIN W1B (30,000 words)
la/lah       13                     1493                           2
le/leh       2                      35                             0
ma/mah       0                      14                             0
lor          7                      137                            0
hor          0                      49                             0
Total:       22 (1.1)               1728 (84.6)                    2 (0.7)

Table 4. Discourse Particles (frequencies per 10,000 words are given in brackets)
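Since the three data sets differ in size, the totals in Table 4 are only comparable once they are normalized. The sketch below shows the conversion to frequencies per 10,000 words; the function name is illustrative, and the corpus sizes used are the nominal word counts given in the table, which is an assumption about how the published figures were derived.

```python
def per_10k(count, corpus_words):
    """Normalize a raw token count to a frequency per 10,000 words."""
    return round(10_000 * count / corpus_words, 1)


# Totals from Table 4, paired with the nominal corpus sizes (assumed).
csw_rate = per_10k(22, 200_000)     # weblogs
s1a_rate = per_10k(1728, 200_000)   # conversations
w1b_rate = per_10k(2, 30_000)       # social letters

print(csw_rate, s1a_rate, w1b_rate)
```

With the nominal sizes, the values for CSW and W1B match the table, whereas the S1A total comes out at 86.4 rather than the 84.6 given there. One possible explanation is that the actual word count of the conversation files differs slightly from the nominal 200,000 words.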

As Platt (1987: 395) claims that some of the particles (e.g., ma, le and hor) are primarily used by speakers of Chinese ethnicity, a logistic regression analysis was performed considering the factors sex, age group, education and home language, but none of these factors or combinations thereof led to a truly successful model for the use of discourse particles in CSW. [7] Even if la and lah are considered together, there is only an insignificant preference by women for la(h), but the model as a whole is no longer significant. With regard to home language, the model is highly significant, but shows no significant difference between bloggers of Malay or Chinese background. It thus appears that the particles are used by all Singaporeans, regardless of their sociolinguistic background.

Previous researchers do not always agree with regard to the pragmatic functions of the various particles. Bell and Ser (1983: 1) propose to distinguish two variants of la(h) which differ in vowel length, intonation and pragmatic function. In the written medium, such differences cannot be observed. The transcribers of the ICE conversations used the spelling lah almost exclusively, and the same was true for the bloggers represented in CSW. The spellings appear to become fixed, despite the variability in speech.

The functions of lah as described in Lim (2007: 460) and Wee (2004: 188f.) as a marker emphasising the mood of the speaker and as a marker of solidarity and informality, often used as a softener of the speech act, are also represented in the corpora. Compare examples (1) to (5):

(1) So heavy one lah! (ICE-SIN W1B-010)
(2) …, my BMT memories are starting to fade. Probably due to age lah. (CSW-052)
(3) Haiyah, i dunno what u laymen call it lah. (CSW-056)
(4) Mr Yeow said that No lah (ICE-SIN S1A-003)
(5) No lah no wine lah (ICE-SIN S1A-006)

Lor is described as a resignative by Wee (2002), and as marking a sense of obviousness as well as resignation by Lim (2007: 461) and Wee (2004: 122). The examples found in ICE-SIN and CSW confirm this analysis, as illustrated in examples (6) to (9):

(6) A: Then after that? B: Come here lor (ICE-SIN S1A-007)
(7) It’s soooo tiring lor! (CSW-050)
(8) These people are super kiasi and refuse to make themselves responsible for anything by signing their names on record lor (CSW-028)
(9) Ya ya everything clashes with Film Fes[tival] lor (ICE-SIN S1A-025)

While (6) illustrates the function of marking the utterance as obvious, example (7) can be interpreted as both resignative and marking the obvious, as it is written by a mother of a baby who is not getting much sleep. Examples (8) and (9) are more resignative in meaning. With the relatively small number of examples in CSW, a quantitative analysis of the various subfunctions of lah and lor did not appear feasible. Suffice it to say that when bloggers choose to employ discourse particles, the functions are comparable to those in spoken discourse, but the different realizational variants (stressed or unstressed, tone) can only be inferred by readers familiar with them.
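Retrieving the particles from a corpus is essentially a token search over their spelling variants. The following sketch is not the WordSmith procedure used for this study, but a simplified stand-in: the variant lists follow the row labels of Table 4, the sample sentences are taken from examples (2), (3), (5) and (7), and all names in the code are hypothetical.

```python
import re
from collections import Counter

# Spelling variants per particle, following the row labels of Table 4.
PARTICLES = {
    "la/lah": ("la", "lah"),
    "le/leh": ("le", "leh"),
    "ma/mah": ("ma", "mah"),
    "lor": ("lor",),
    "hor": ("hor",),
}

# Map every spelling variant to the label under which it is counted.
VARIANT_TO_LABEL = {
    variant: label for label, variants in PARTICLES.items() for variant in variants
}


def count_particles(text):
    """Count whole-word particle tokens in text, ignoring case."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(VARIANT_TO_LABEL[t] for t in tokens if t in VARIANT_TO_LABEL)


SAMPLE = [
    "Probably due to age lah.",
    "Haiyah, i dunno what u laymen call it lah.",
    "No lah no wine lah",
    "It's soooo tiring lor!",
]

counts = count_particles(" ".join(SAMPLE))
print(counts)
```

Because the search matches whole tokens, laymen in example (3) is not counted as la. Short forms such as ma or le may still coincide with other words, so hits from an automatic search of this kind would need to be checked manually.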

4.2 Zero constituents

Singlish has been described as a pro-drop language, as the subject or object of a clause may be left out if the information is recoverable from the context. As both Malay and Chinese are also pro-drop languages, this is regarded as a typical contact feature of Singlish, since Standard English requires the expression of subject and object in its canonical clause structure (Alsagoff & Ho 1998: 147f.; Wee & Ansaldo 2004: 71f.; Low & Brown 2005: 105ff.). Similar to other English-based Creoles, the copula may also not be realized in a variety of syntactic contexts, e.g., before adjective phrases, noun phrases, prepositional phrases or progressives (Alsagoff & Ho 1998: 138f.; Fong 2004: 82–87; Low & Brown 2005: 90ff.). Zero copula is not linked to the substrate languages in previous research, but rather to other varieties of English, such as AAVE (Fong 2004: 99). To determine whether such zero constituents also occur in the CSW data, two smaller subcorpora (cf. section 4. Data Analysis) were manually tagged for zero constituents. The frequencies per 10,000 words are given in Table 5 below:

n/10,000 words    ICE-SIN S1A    CSW
zero copula       50             6
zero subject      57             13.5
zero object       14             1.5

Table 5. Zero Constituents
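The normalized rates in Table 5 can be converted back into approximate token counts, which makes the small size of the weblog sample visible. The sketch below assumes that each of the two manually tagged sub-corpora comprises exactly 20,000 words (10 files of 2,000 words); the names used are illustrative.

```python
SUBCORPUS_WORDS = 20_000  # assumed size of each manually tagged sub-corpus

# Frequencies per 10,000 words from Table 5.
RATES = {
    "zero copula": {"ICE-SIN S1A": 50, "CSW": 6},
    "zero subject": {"ICE-SIN S1A": 57, "CSW": 13.5},
    "zero object": {"ICE-SIN S1A": 14, "CSW": 1.5},
}


def raw_count(rate_per_10k, corpus_words=SUBCORPUS_WORDS):
    """Convert a frequency per 10,000 words into an absolute token count."""
    return rate_per_10k * corpus_words / 10_000


def ratio(feature):
    """How many times more frequent a feature is in S1A than in CSW."""
    rates = RATES[feature]
    return round(rates["ICE-SIN S1A"] / rates["CSW"], 1)


for feature, rates in RATES.items():
    print(feature, raw_count(rates["ICE-SIN S1A"]), raw_count(rates["CSW"]), ratio(feature))
```

Under this assumption, the CSW rates correspond to 12 zero copulas, 27 zero subjects and only 3 zero objects, which explains the half-point values in the table and suggests caution in interpreting the less frequent categories.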

Zero constituents are also attested in the CSW data, but less frequently than in the conversations from ICE-SIN. However, in both sub-corpora under analysis, zero subjects were most frequent, followed by zero copula and then zero objects. This can easily be explained by the fact that the weblogs are generally written in the 1st person and thus 1st person singular subjects are very easily recoverable from the context, similar to a face-to-face situation, as illustrated in examples (10) to (12) below. This is also reminiscent of the 1st person diary style, which also allows the omission of the subject. Unfortunately, we have no comparable data from British or American bloggers to check whether Singaporeans use zero subjects more frequently because they also do so in informal spoken language. In addition to these, there are also less frequent occurrences of 2nd person singular subjects which are not realized, as in example (13):

(10) _ Wanted to uh but so far away (S1A-050)
(11) _ Hope to be there Audrey is helping too (S1A-010)
(12) _ Walked till the soles of my feet ached today & went home to watch the show. (CSW-010)
(13) _ Got girlfriend already? (CSW-020)

Zero objects occur less frequently than zero subjects, most likely because the object is often used to express new information in the clause and thus less likely to be recoverable from the context. Examples (14) to (17) illustrate typical uses.

(14) Uh you can ask him to buy _ lah (S1A-040)
(15) It’s about spirits and ghosts kind of thing You watch _ uh (S1A-030)
(16) Mummy fed him that first day. Asked her to give _ after his milk. (CSW-050)
(17) Mom instructed _ over the phone, short and concise. (CSW-070)

In example (16) two objects are missing, the direct object (the food) and the indirect object (the baby). This is the only occurrence of two missing objects in the CSW data analyzed so far.

The instances of zero-copula were equally divided between positions before adjective and before noun phrases, which is in line with previous research (e.g. Low & Brown 2005: 91). Examples (18) to (21) illustrate these syntactic contexts with data from both corpora:

(18) Aiyah _ so hot Japan you know (S1A-050)
(19) Because if this _ poster then you just do one for both uh (S1A-020)
(20) “_ Your friend a girl?” I ask. (CSW-100)
(21) We _ so cool we play Donkey yo! (CSW-080)

This is different to the findings by Ho (1993: 145), who found more instances of zero-copula before progressives than before noun phrases, but due to the small number of tokens, the difference is not significant. Interestingly, the CSW data revealed only two instances of zero-copula with progressives, as in example (22):

(22) “Where _ you going after this?” I ask between mouthfuls. (CSW-100)

In a previous analysis of a 20,000-word sub-corpus of ICE-SIN S1A, progressives without the copula were found in 3% of all progressive forms, which is also a medium rate of omission compared to other contact varieties of English (cf. Sand 2005). A similar quantitative analysis of the CSW data will have to be done to show whether the distributions are comparable there.

4.3 like – hedge and quotative

The last feature analyzed for the present paper was the use of like as a discourse marker (hedge or emphatic marker) and as a quotative. This feature, especially the quotative function, has been reported to be spreading from the United States (cf. Butters 1982) throughout the English-speaking world, as for example to England and New Zealand (cf. Buchstaller & D’Arcy 2009), Canada (cf. Tagliamonte & D’Arcy 2004) or Jamaica (cf. Mair 2009: 55–58). Although mainly associated with informal spoken discourse, quotative like has also been attested in CMC, for example in instant messaging (cf. Jones & Schieffelin 2007) or internet newsgroups (cf. Buchstaller et al. 2010). Despite its origins among middle-class girls on the American West Coast, more recent developments have shown that this association no longer holds and that the feature was also picked up by other groups of speakers elsewhere (cf. Buchstaller & D’Arcy 2009). The question with regard to the Singaporean data was then whether the feature had been adopted in CMC and if it was possible to identify its primary users. Table 6 below summarizes the occurrences of the various discourse functions of like in CSW as well as in the conversations (S1A) and social letters (W1B) of ICE-SIN. The non-quotative discourse functions of hedge and emphatic marker are grouped together, and since the W1B-corpus consists of only 30,000 words, the total numbers are also given as frequencies per 10,000 words.

Function            CSW                    ICE-SIN S1A         ICE-SIN W1B
Hedge/emphasis      67 (81.7% of total)    … (97% of total)    …
Quotative           15 (18.3% of total)    … (3% of total)     …
Total (n/10,000)    …                      …                   …

Table 6. Discourse like
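The percentages for CSW in Table 6 follow from the two token counts that are reported for the weblogs, 67 hedge or emphatic uses and 15 quotative uses. A minimal sketch of the calculation, with illustrative names:

```python
# Token counts of discourse like in CSW as given in Table 6.
CSW_LIKE = {"hedge/emphasis": 67, "quotative": 15}


def shares(counts):
    """Return each function's share of all discourse like tokens, in percent."""
    total = sum(counts.values())
    return {function: round(100 * n / total, 1) for function, n in counts.items()}


csw_total = sum(CSW_LIKE.values())
csw_shares = shares(CSW_LIKE)

print(csw_total, csw_shares)
```

The shares are thus based on 82 tokens of discourse like in the weblogs, so that roughly one in five of these tokens is quotative.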

As the table shows, the discourse functions are also attested in the two written text types, but are very rare in the social letters; nor are there quotative uses in the letter data. As was to be expected after the analysis of the Singlish features, the occurrences of discourse like are less frequent in the weblogs, but the functions are similar, as illustrated in examples (23) to (26) below. Examples (23) and (24) are typical hedges or fillers, while (25) and (26) are more emphatic in meaning.

(23) My Dear Vimla, like basic Hi! (W1B-006)
(24) I was like very shocked like he really went to the room and take out stuff and like showed us like eh (S1A-097)
(25) It’s Friday! Like HOOOOOORAYYYYYY! (CSW-090)
(26) …, the volcano in Iceland erupted and air travel was disrupted for like forever?!!! (CSW-099)

The examples of quotative like shown in (27) to (30) fulfil the same function, but the conversations also contain examples of like without be, as in (27). It also becomes clear that pronominal subjects are preferred, as was to be expected on the basis of the previous research cited above.

(27) I hated it when I have to call up one person and say and like do you tell me about that group (S1A-010)
(28) She’s like uhm help who’s that that protagonist in Look Back in Anger (S1A-093)
(29) he was like: “it’s MY house these people are in […] (CSW-074)
(30) I was like oh man… not some survey… (CSW-098)

The interesting difference between the data from CSW and ICE-SIN S1A is that the percentage of quotative uses is much larger in the weblogs than in the conversations (p<0.001). We also carried out a logistic regression analysis to determine whether gender, age or any of the other sociolinguistic factors plays a role in the distribution. It was found that when age, sex and level of education are taken into account, no significant factor can be identified. It is only when the home languages are considered that there is a significant result: With bloggers who speak a variety of Chinese as their home language, quotative like is less likely to occur. In addition to that, when the interaction of sex and age is considered, the occurrence of quotative like becomes less likely in older women. When the other non-significant factors are considered as well, the model becomes even better, but the home language is then again the only significant predictor. [8] This result is really unexpected and will have to be followed up with a comparative analysis of the conversation data in ICE-SIN.
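To give an impression of what the odds ratio for home language reported in note [8] implies, the following sketch applies it to a few baseline probabilities. The odds ratio of 0.066 is taken from the model with home language and age*sex; the baseline probabilities are purely hypothetical and not derived from the corpus, and the sketch assumes that the odds ratio contrasts bloggers with a Chinese home language with all other bloggers.

```python
import math

HOME_LANGUAGE_OR = 0.066  # odds ratio reported in note [8]


def shifted_probability(p_baseline, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = p_baseline / (1 - p_baseline)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


# An odds ratio below 1 corresponds to a negative regression coefficient.
coefficient = math.log(HOME_LANGUAGE_OR)

# Hypothetical baseline probabilities of using quotative like.
for p in (0.2, 0.5, 0.8):
    print(p, round(shifted_probability(p, HOME_LANGUAGE_OR), 3))
```

Even for a hypothetical baseline of 0.5, the odds ratio reduces the probability of quotative like to about 0.06, which illustrates why home language stands out as a predictor in the models.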

5. Conclusions

The exploratory analysis presented here has shown that features associated with informal spoken communication, both locally and globally, are indeed attested in Singaporean weblogs, despite the fact that weblogs are text(s) written in a public domain. The quantitative analyses reveal, however, that the frequencies are much lower than in the conversation data, but higher than in the private written text-type of the social letters. This is in line with findings from media studies on American English weblogs, in which the percentages of non-standard and spoken features also range between traditional written genres and more interactive CMC genres (cf. de Gerdes 2005: 209–122). To find out more about the factors which trigger the use of the Singlish or oral features, the complete corpus will have to be coded for a more thorough multivariate analysis. In addition to that, more features, such as the presence or absence of inflections, the uses of loanwords, the occurrence of exclamations and other discourse markers will have to be examined. The possible correlations between the type of weblog (e.g., diary versus filter or knowledge blog) and the occurrence of non-standard and spoken features will also have to be examined in more detail. Nevertheless, the data has already provided enough evidence to support the claim that CMC could indeed be one of the factors promoting Singlish in written usage. Other factors may also play a role here, for example the positive language attitudes towards Singlish as a marker of Singaporean identity expressed especially by younger speakers (cf. Mei 2001 and Table 2 above) or the fact that English – very often in the form of Singlish – is the first home language for an increasing number of Singaporeans (cf. Gupta 1998). These first results are certainly encouraging for future research into CMC in Singapore, for example discussion forums and other more interactive text types.


[1] More than one language could be provided by the informants.

[2] Japanese is not a language commonly spoken in Singapore, but the informant had immigrated to Singapore with her parents at age 1 and was therefore included in the corpus despite the additional home language of Japanese.

[3] See, for example, Gil (2003: 269) on the government’s efforts to eradicate Singlish by banning it from TV and movies, but also through an on-going educational campaign called the Speak Good English Movement (cf. SGEM homepage).

[4] I would like to thank my research assistant Franziska Hackhausen for doing the lion's share of the work in the compilation of the corpus, as well as its annotation and coding. Without her, the corpus would not yet exist. Thanks also to Bernd Elzer, who helped with the coding of several features and the proofreading of this paper.

[5] I would like to thank Daniela Kolbe for her tremendous support in the statistical evaluation of data. All errors are of course my own responsibility.

[6] The meanings of the particles will be discussed below. As for their pronunciation, la(h) /la/ exists in a long, stressed form and in a short, unstressed form and is pronounced with different intonation patterns depending on its function. Ah /a:/ and hor /ho/ are generally used with a rising tone, while eh /e:/ is used with a falling tone. Lor /lo:/ and meh /me:/ are used with a high level tone, while ma /ma:/ is usually pronounced with a mid level tone (Low & Brown 2005: 176–179).

[7] We could not perform the logistic regression on the ICE data, as the necessary sociolinguistic information is not available to us.

[8] The model as a whole is significant, however, with chi square p=0.49 and Nagelkerke R² 0.31, correct prediction 70.7%. The model summary for home language and age*sex is significant with chi square p=0.004, Nagelkerke R² 0.31 and an increase of correct prediction to 78.7%. The odds ratio for home language is 0.066, p=0.003. When all factors are considered, significance is higher with chi square p=0.002, Nagelkerke R² 0.41 and an increase of correct prediction to 80.3%. The odds ratio for home language is 0.049, p=0.010. The odds ratio for age*sex is <0.001, p=0.99.


