Translation data problems


Andrew Chesterman


[2008f. In J. Lindstedt et al. (eds), S ljubov’ju k slovu. Festschrift for Arto Mustajoki. Helsinki: Department of Slavonic and Baltic Languages and Literatures, 17-26.]



1. Why is a translation like a joke?


In 2000 Arto Mustajoki and a group of colleagues and students published a bilingual book of Russian jokes translated into Finnish. His brief introduction touches on several of the points I will develop below. – The whole collection is a translation, with both source texts and target texts visible, and thus open to a translation assessment. The translations were done by students, i.e. not by professional translators; but they have nevertheless been published. Mustajoki comments on the special role of jokes in Russian culture, and the culture-bound nature of many examples. He points out that the Russian word anekdot corresponds to both “joke” and “anecdote” in Finnish, and that one of the Finnish words for a kind of joke (sutkaus) has a Russian origin: there are thus complex relations between the terms used in the two languages. And he concludes that the book will have reached its aim if the reader occasionally laughs; one cannot expect everyone to laugh at every joke: after all, humour is largely a matter of taste.

Why is a translation like a joke? No, the answer I have in mind is not: “they can both make you laugh”. The comparison invites a number of observations about the problematic nature of translation data, and raises issues that have still not been resolved in Translation Studies. Translation research makes use of many kinds of data. Apart from translations themselves, and their source texts, and comparable non-translated texts, data may come from questionnaires, interviews, translation reviews, translator protocols and recordings, measurements of eye movements, archives, observational fieldnotes, and so on. This paper focuses on translations themselves as data, and in particular on the question of definition. Definition problems lead to data problems, and these in turn lead to generalization problems – and this is no joke.


2. The quality issue


Consider first the quality issue. Translation theory students have long been taught that modern Translation Studies no longer sets out to be prescriptive, telling translators how they should translate, but has descriptive and explanatory aims, just like any other academic discipline. I have argued (Chesterman 1999) that if we study how clients, readers etc. react to translations, we can find out something about quality expectations in different contexts, and thus include quality considerations within a descriptive framework. But the quality issue is still a problem when it comes to selecting data.

              Is a bad joke still a joke? If no-one laughs, or even smiles? Of the jokes in Mustajoki’s collection, let us assume that some provoke more laughter than others: a higher proportion of readers laugh at these, and/or they laugh louder and longer. But the poor jokes are surely still jokes. They simply belong to the joke subset “bad jokes”. In the same way, a bad wine is still a wine, a bad day is nevertheless a day. I imagine that, as these jokes were collected, there was some kind of selection process. Some were excluded, as they were not considered “good enough”. But I doubt whether any were excluded on the grounds that they were “not jokes”: the very fact that they were on the table for consideration in the first place meant that they were taken to be jokes of some kind – in someone’s opinion, at least. Jokes that are not funny to one person may still be funny to someone else.          

But can a joke be so bad that no-one takes it as a joke? And if so, is it still a joke nevertheless? It would of course be impossible to investigate this empirically and universally (it would involve quite a bit of translation, for a start). But there is always at least one person who does think that our imagined appalling joke is a joke: the joke-teller. The teller’s intention is that this is a joke and shall be taken as a joke. If no-one laughs, the teller is disappointed, but this does not necessarily change his/her categorization of the utterance as a joke – an intended joke may not be successful, but that does not exclude it from the joke category unless we define this category as also requiring the laugh/smile  of at least one hearer, in addition to the teller’s intention.

              Can a “translation” be so bad that it must be classified as a non-translation, outside the set comprising translations of all kinds? In comparison with jokes, the translation case is more complicated. On the receiving end, some readers may have access to the source text or the source language, which may well affect their judgement of whether a given bad translation is actually a translation or not. On the production end, we have a huge range from skilled professionals to unskilled amateurs. Most textual research in Translation Studies works with translations done by assumed professionals (or future professionals); the translations have mostly been published, for instance. We are thus neglecting most “natural” translation done e.g. by children or bilingual speakers in general. (True, there are professional joke-tellers as well, stand-up comics etc., but we do not necessarily think their jokes are better.) We also find that some people translate out of their first language as well as into it: this may affect the quality of their work, on some criteria, but it does not of itself disqualify such texts from membership of the set of translations. And then there is the question of the unit in question, the translation status of which is to be judged.

              The term “translation unit” normally refers to the portion of source text that a translator deals with cognitively as one chunk, during the formulation of a target-language version: typically, this is a clause, or perhaps a sentence. This is not the sense of “unit” I mean here. The unit problem I refer to is this: what is the nature of the segment of text in the target language about which the judgement “translation / not a translation” is to be made? Let us call this the “judgement unit”, and take an example, which also shows how relevant this problem is to the notion of bad translation.

              On my desk I have a leaflet describing trekking excursions in Lanzarote, run by a Canary trekking company. The text is in three languages: Spanish, English and German. I take Spanish to be the source language, since it appears first and Spanish is the local language. The format and layout of the three versions are the same, describing three trekking routes. There can be no question that the English and German versions are intended to be, and are read as, translations of the Spanish. As usual in such tourist brochures, the English is sometimes odd and there are occasional spelling mistakes (“You will perfectly understand how works a volcano”; “It is essencial [sic] to wear trainers or hiking boots, light jacket, water and picnic”). But there is no reason to deny even these bits the status of translations, albeit non-native and/or careless. A non-native English reader may not even notice these slips. A reader with a knowledge of Spanish may notice some minor syntactic and semantic shifts at these points in the text (from the original Podrá entender el funcionamiento de un volcan and Imprescindible: Calzado cerrado, abrigo ligero, agua y picnic), but nothing unusual.

However, the text also has some surprises. In Spanish, we are told the length of each trek: the first is 6 Km, the second 7 Km and the third 4 Km. In English these lengths are given as 6 Km, 8 Km, and 4 Km. (All three versions have capital “K” here, which is not normal in Spanish or in English. Was the text translated by a German speaker?) So the second trek is one kilometre longer in English. In German the lengths are: not given, not given, and 4 Km. – Why these discrepancies? Carelessness? Maybe. But now consider the question of the judgement unit. Given the Spanish source-text segment 7 Km, shall we say that the English 8 Km is a translation of it, or an erroneous translation, or not a translation at all? Is the omission of this information in German also a translation? After all, the strategy of omission is a valid translation solution in some circumstances.

              I suggest that if this segment is taken out of context, 8 Km cannot be regarded as a translation of 7 Km. However, the implicit standard solution has been to assume that the judgement unit is the largest one possible, i.e. the complete text. In this case, we take the whole English and German versions to be translations, yes, although they contain errors and curious omissions at the level of smaller segments. A text as a whole, then, can be judged a translation even though it contains instances of non-translation, or inaccurate, unfaithful, non-equivalent renderings. Recall that whole books can be sold as translations even though chapters or other large segments have been omitted or censored. And a gist translation, aiming to provide no more than a summary of the source text, is also a valid form of translation. Taking the whole text as the unit for judging whether or not we are dealing with a translation is also the normal solution in research on translation recognition (e.g. Tirkkonen-Condit 2002).

              To summarize so far: any randomly selected corpus of translations is likely to contain instances of bad translation, possibly including segments of non-translation – perhaps because the translator was non-professional, working into a non-native language, or simply careless or short of time. Any descriptive theory of translation will have to allow for this fact. This causes difficulties for at least two theories: skopos theory and norm theory.

Skopos theory (from Reiss and Vermeer 1984 on) assumes that a translation seeks to fulfil its skopos (purpose), and that this skopos is the primary determining factor affecting choices of translation strategies and solutions. The trouble is that not all translations fulfil their purposes in an optimal way (and some may not fulfil their aim at all, if they are really bad), and skopos theory thus seems to be describing an ideal situation rather than a real one. Its generalization does not seem to hold for all possible members of the set “translation”.

Norm theory (e.g. Toury 1995) assumes that translators are influenced by the prevailing norms about what translations should be like. Ideal translators, anyway... But what about translators who do not know much about the relevant norms, or do not care about them, or are unable to meet them? They too produce translations. Two solutions are offered. One is simply to classify such texts as norm-breaking translations, i.e. not as non-translations. Norm-breaking may lead to sanctions (e.g. rejection of the translation by the client), but it may contribute to the establishment of new norms (a classic Finnish example is Saarikoski’s translation of Salinger’s Catcher in the Rye; see e.g. Koskinen 2007). The other solution is to loosen the definition of norms. Some scholars (e.g. Hermans 1999) have argued that norms should be regarded more generally as expectations; and expectations may concern what is typical rather than what is correct. Tourist brochure translations are seldom written in an elegant, native-speaker style; we expect them to sound odd. In this sense, the oddness of my sample text above does indeed meet our expectations.

              So we return to the larger point: given a text T, how shall we decide whether it is a translation or not, however bad it may be?


3. The definition issue


Both jokes and translations can be seen as forms of oral or written (or multimedial)  text, embedded in a discourse. Formally, jokes have at least one clear distinctive feature: the punch line at the end. If there is no punch line, there is no joke. Jokes often also have an initial marker (such as “Did you hear the one about...?”), but this is not obligatory. With translations, the situation is not so clear: not all translations are supposed to raise a smile, for a start. Competing definitions abound.

              Perhaps the most influential attempt to specify a set of distinctive features for translations is Toury’s three postulates (1995: 33f). Strictly speaking, these are presented as conditions which, if they hold, support the assumption that a given text T is a translation. The postulates are, in my paraphrase: (1) there is, or has been, a source text; (2) text T has been derived from this source text via a transfer process; and (3) there is a relation between text T and this source text which we can call some kind of equivalence. The nature of this equivalence relation is not specified a priori, however; it depends on culture-specific norms, and these may vary widely. – These postulates have not gone without criticism. A translation may derive from several source texts (so-called eclectic translations), perhaps in different languages. Postulates 2 and 3 seem to overlap. And what about the status of texts that may fulfil postulates 1 and 2 but not 3? These would break the culture-specific norms concerning the intertextual relation between source and target text – i.e. they are “not equivalent enough” in some way – but they might still be taken to be translations, by some people at least... On the other hand, the postulates do have the advantage of widening our dataset to allow the inclusion of a much wider range of assumed translations than is allowed by a narrower definition.

For instance, on these criteria, the dataset of assumed translations also includes pseudotranslations, as Toury indeed points out: these are texts that are presented as translations, but are later proved not to be. Literary history has many famous examples, one being Voltaire’s Candide. It thus makes perfect sense, in the case of a pseudotranslation, to say: I first thought it was a translation but then realized it wasn’t. (Or even, in the converse case of pseudo-originals: I thought it was an original, but it turned out to be a translation.) What about jokes? The situation seems parallel. One can say: I thought it was a joke, but then I realized it wasn’t. One can also say: I first took it seriously, but then realized it was a joke.

This pinpoints a crucial insight: both joke-telling and translating are communicative acts. Both involve an intention and an associated claim by one party, and acceptance of these by another party. In the case of translation, the two relevant parties are the translator and a receiver (client or reader). On completing and submitting a text as translation, a translator makes the implicit claim that the text in question is indeed a translation – the translator’s intention is that the text should be accepted as such – and also the implicit claim that the translation is adequately equivalent to the source text (Pym 1995). In other words, the implicit claim is that the text represents its source in some relevant way, both as proxy (standing for the source) and in terms of relevant resemblance (Hermans 2007). Compare the joke-teller’s implicit claim or intention to be telling a funny story. But this claim may be disputed. A reader (or many readers) may disagree; we may not take the intended joke as a joke. The claim may also be mistaken, or insincere. A translator may claim that the text represents the source, even though in fact it misrepresents it: throughout history there has been no shortage of (intended or unintended) distorted translations. (For a discussion of recent examples, see Baker 2006.)

Ultimately, therefore, the status of the text as a translation or not is a matter to be negotiated and agreed – usually, of course, only implicitly, except in cases of dispute. In the final analysis the same holds for all definitions: they are all interpretive hypotheses, revisable agreements accepted by a community. (For further discussion of Toury’s and other proposals for formal definitions of translation, see Pym 2007; and also the special issue of Target 2007, 19, 2, on the metalanguage of translation.)

So, in addition to the fuzziness introduced by the quality issue, we have further sources of fuzziness, introduced by the inevitable role played by different agents, at different times and in different cultures, in defining what is to be accepted as a translation.


4. What kind of concept are we looking for anyway?


There is still a third source of fuzziness. The Translation Studies community has not even yet agreed on the kind of concept we should be trying to define.

              Since there are obvious fuzzy borders around the concept, e.g. between “translation” and “adaptation”, we cannot be dealing with a classical, clear-cut concept. One proposal has been to see the concept as a prototype, with typical forms of translation at the centre and less typical forms on the periphery. Halverson (1998) presented some empirical, experimental evidence in favour of this interpretation. At the centre of her model she places published professional translations, at least for the industrialized West, but allows quite a variety of other types around it, including natural translations and “student translations”. She also accepts that all prototypes are culturally determined and may thus vary.

              Against the idea of translation as a prototype concept, others such as Tymoczko (1998) have argued that precisely because prototypes themselves tend to vary across cultures and times, we cannot formulate a prototype concept that would be valid universally. Compare Toury’s view of translation as norm-governed activity, mentioned above: norms are not universal. This does not actually contradict Halverson’s position, but Tymoczko draws a rather different conclusion. In place of a prototype concept, Tymoczko offers Wittgenstein’s notion of a cluster concept, like that of “game”, whereby different instances of the category are linked only by family resemblances, not by a set of shared essential features. In Chesterman (2006) I explored the extent to which the key semiotic features of similarity, difference and mediation, proposed as fundamental to translation by Stecconi (2004), are differently represented in the etymologies of words denoting translation in a variety of languages, leading to different kinds of conceptual clusters. It seems that in some languages, such as most Indo-European ones, words denoting translation cluster around the notion of transferring something that remains unchanged, whereas corresponding words in some other languages cluster more around the notion of difference (such as the Finnish verb kääntää ‘translate’, for instance, which literally means  ‘turn, change direction’).

              This issue – of the type of concept we are dealing with – remains open. The fuzziness is acknowledged, but there is no general agreement on whether translation is best conceived of as a prototype or as a cluster concept, or perhaps as some combination of both. What both prototype and cluster concepts have in common, however, is that they do not assume a universally valid, essentialist version of the concept in question. Variation across place, culture and time, as well as context, is fundamental – as also in the case of jokes.

              Why should this be a problem?


5. So what?


I have outlined three sources of fuzziness in Translation Studies, which all make it difficult, if not impossible, to establish clear criteria for what counts as a translation: the problem of bad translations, the context-boundness of definitions of translation as a communicative act, and the unclear nature of the translation concept itself. This means that it is not self-evident what we should accept as translation data. (Do we include the Canary trekking text or not? All its segments?) The consequence is problematic if there is an expectation that Translation Studies, like any other academic field, is interested in formulating generalizations about its object of study. It simply does not seem possible to make justified claims – about tendencies or patterns or regularities or laws – that are valid for all translations, universally.

In the prescriptive tradition this has long been recognized. At least since Savory’s famous list (1968) of contradictory prescriptive statements – translations should give the words of the original / the ideas of the original;  translations should read like an original work / should not read like an original work; etc. – we have been wary of universal prescriptive generalizations. It has long been obvious that useful prescriptive statements depend on the purpose of the translation, the text type, and so on. The only valid universal prescriptive guidelines are therefore very abstract indeed, and even risk being somewhat trivial, such as “a translation should aim to fulfil its purpose”.

              Nowadays, despite the interest in research on translation universals (e.g. Mauranen and Kujamäki 2004), we realize that we must also be wary of universal descriptive generalizations. Any claim starting “all translations are...” raises doubts. Many projects have tested hypotheses about so-called translation universals, only to find that in the material studied the universal trait is not so clearly manifested after all, or even that there is more counter-evidence than evidence. The evidence for and against one such hypothesis – the retranslation hypothesis, which claims that later translations of a given text, into the same target language, tend to be closer to the original than earlier translations – is discussed in detail by Koskinen and Paloposki  (forthcoming). The writers come to the conclusion that the hypothesis is quite simply false, if it is assumed to apply universally.

Does this mean that we must simply give up all attempts to generalize? Are all generalizations in Translation Studies a danger, as Agorni (2007) argues? Agorni’s proposed solution is “localism”, i.e. to focus on the local contingent conditions of each particular case. But this seems unnecessarily restrictive. Without neglecting the particular, we must surely seek to go beyond it and look for some kinds of general patterns. But it must be acknowledged that interesting empirical generalizations can only be conditioned ones, restricted to a given quality of translation, a given cultural, historical and linguistic context, and a given definition. This may be one reason why so much of translation research is based on case studies.

This is not to say that such limited generalizations are not useful. On the contrary. Working with a cluster notion of translation, for instance, we are finding many interesting and useful things to say about translations of certain kinds in certain contexts. We can even note similarities between one cluster and another: between the belles infidèles of 18th-century French literary translation, for instance, and the strongly adaptive translations that are found in modern localization.

One important consequence of this situation is the need for a systematic typology of such clusters, or translation types, based on a repertoire of possible variables, such as those mentioned above: quality, historical and cultural context, purpose and so on. We are still some way from achieving such a goal.

It may finally be of some consolation to recall that interesting concepts always defy precise definitions – they are always open to new interpretations. That’s why they remain interesting. In this respect, translation is in the good company of a whole host of other fuzzy concepts ranging from language itself to humour.





Agorni, Mirella 2007. Locating systems and individuals in translation studies. In Michaela Wolf and Alexandra Fukari (eds), Constructing a sociology of translation, 123-134. Amsterdam: Benjamins.

Baker, Mona 2006. Translation and Conflict: A Narrative Account. London: Routledge.

Chesterman, Andrew 1999. The empirical status of prescriptivism. Folia Translatologica 6. 9-19.

Chesterman, Andrew 2006. Interpreting the meaning of translation. In Mickael Suominen et al. (eds), A man of measure. Festschrift in honour of Fred Karlsson on his 60th Birthday, 3-11. Turku: Linguistic Association of Finland.

Halverson, Sandra 1998. Concepts and categories in Translation Studies. Bergen: University of Bergen.

Hermans, Theo 1999. Translation in systems. Manchester: St. Jerome Publishing.

Hermans, Theo 2007. Translation, irritation and resonance. In Michaela Wolf and Alexandra Fukari (eds), Constructing a sociology of translation, 57-75. Amsterdam: Benjamins.

Koskinen, Kaisa 2007. Pentti Saarikoski (1937-1983). In H.K. Riikonen et al. (eds), Suomennoskirjallisuuden historia (2), 501-506. Helsinki: SKS.

Koskinen, Kaisa and Outi Paloposki (forthcoming). Sata kirjaa, tuhat suomennosta. Näköaloja uudelleenkääntämiseen.

Mauranen, Anna and Pekka Kujamäki (eds) 2004. Translation universals. Do they exist? Amsterdam: Benjamins.

Mustajoki, Arto et al. (eds) 2000. Venäläinen ruletti. Venäläisiä vitsejä suomeksi ja venäjäksi. Helsinki: Slavistiikan ja baltologian laitos.

Pym, Anthony 1995. European Translation Studies, une science qui dérange, and why equivalence needn’t be a dirty word. TTR 8(1). 153-176.

Pym, Anthony 2007. On history in formal conceptualizations of translation. Across Languages and Cultures 8(2). 153-166.

Reiss, Katharina and Hans J. Vermeer 1984. Grundlegung einer allgemeinen Translationstheorie. Tübingen: Niemeyer.

Savory, T.H. 1968. The art of translation. London: Cape.

Stecconi, Ubaldo 2004. Interpretive semiotics and translation theory: the semiotic conditions to translation. Semiotica 150. 471-489.

Tirkkonen-Condit, Sonja 2002. Translationese – a myth or an empirical fact? Target 14(2). 207-220.

Toury, Gideon 1995. Descriptive Translation Studies and beyond. Amsterdam: Benjamins.

Tymoczko, Maria 1998. Computerized corpora and the future of Translation Studies. Meta 43(4). 652-659.

