Productivity and Lexical Pragmatic Features in a Contemporary CAT Environment: An Exploratory Study in English to Japanese

As the translation profession has become more technologized, translators increasingly work within an interface that combines translation from scratch, translation memory suggestions, machine translation post-editing, and terminological resources. This study analyses user activity data from one such interface, and measures temporal effort for English to Japanese translation at the segment level. Using previous studies of translation within the framework of relevance theory as a starting point, various features and edits were identified and annotated within the texts, in order to find whether there was a relationship between their prevalence and translation effort. Although this study is exploratory in nature, there was an expectation based on previous studies that procedurally encoded utterances would be associated with greater translation effort. This expectation was complicated by the choice of a language pair in which there has been little research applying relevance theory to translation, and by contemporary research that has made the distinction between procedural and conceptual encoding appear more fluid than previously believed. Our findings are that some features that lean more towards procedural encoding (such as prevalence of pronouns and manual addition of postpositions) are associated with increased temporal effort, although the small sample size makes it impossible to generalise. Segments translated with the aid of translation memory showed the least average temporal effort, and segments translated using machine translation appeared to require more effort than translation from scratch.


Introduction
In the years since the commercial introduction of translation memory (TM) tools circa 1992, specialised translation and localisation have become highly technologized. At that stage, machine translation (MT) and TM were clearly differentiated and industrial application of MT was sporadic (Vasconcellos/León 1985). The improvements in MT quality made possible by the paradigm shift to statistical machine translation (SMT) led to more widespread industrial use of MT for 'gisting' and for post-editing, the latter task most usually carried out within a TM interface (O'Brien/Moorkens 2014). More recently, the lines between TM and MT have blurred, and many translators find themselves working with a variety of tools and inputs within a single interface. Opening a source text segment may result in automatic propagation of a target text (TT) suggestion from the TM if the new source segment and that in the TM match closely. Otherwise, the translator may be offered output from one or other source of SMT. In addition, there may be glossary entries found in the source text, with associated target terms suggested to the translator. The introduction of TM tools brought an associated focus on productivity and, with its associated system of discounts, cost (García 2006). This focus has been heightened with the increased use of other technologies such as MT and some external factors, particularly the global recession that began in 2007 (DePalma et al. 2013).
Research on translation with TM or MT post-editing has often measured effort required by the translator (Krings 2001, O'Brien 2006), most usually using measures of productivity or temporal effort (O'Brien 2005, Carl et al. 2011), as productivity is of particular industrial interest in applied translation research. Several studies used effort measures to compare translation assisted by TM and MT post-editing (Guerberof 2012). Temporal effort has been measured using keylogging software, purpose-built translation process research tools such as Translog (Carl 2012), and more recently by an underlying process within the 'instrumented' interface (see Moorkens et al. 2015 and Section 2 in the current study). In this study we measure temporal effort in terms of processing speed in words per second within a contemporary computer-aided translation (CAT) environment, and carry out a linguistic analysis within the framework of relevance theory (Sperber/Wilson 1986/1995, Wilson/Sperber 1990) for translation. The aim of this exploratory research is to search for patterns of edits or additions to the target text that are related to inferential communication and to investigate whether there is a relationship between these phenomena, the type of computer assistance used, and temporal effort in CAT. In the following sections, we place this research into context, detail our methodology for this study, and present our findings.
Krings (2001) suggested measurement of post-editing effort in three ways: temporal, technical (number of edits), and cognitive effort. Krings' own work and later studies (such as Moorkens et al. 2015) employed all three measurements of translation effort. Studies of post-editing have sometimes analysed 'negative translatability indicators', features that create difficulties for MT in the source text, such as segment length, or long noun phrases, and then compared their effects to measures of post-editing effort (O'Brien 2005).

Context of this research
Linguistic analyses of translation have been published in the fields of translation studies and localisation. Translation studies research has identified many phenomena common to translation, such as explicitation (Blum-Kulka 1986), but the advent of electronic corpora meant that these "universal features of translation" could be more easily analysed (Baker 1993: 242). Baker (1996) noted the increased emphasis on the target text as corpus analysis for translation became more common, allowing for large-scale testing of target features such as simplification, explicitation, and normalisation, and related attempts to formulate a "typology of translated text" (176). In localisation and specialised translation research, typologies have become common in industrial settings, particularly for the analysis of errors in target texts, and a succession of typologies have been published for human translation (such as the LISA QA Model), SMT output (Vilar et al. 2006), and for post-edited texts (De Almeida 2013). Temnikova (2010) created an MT error typology that takes predicted cognitive effort into consideration, and this was used in a study measuring temporal post-editing effort in Koponen et al. (2012).
There have, however, been relatively few applications of relevance theory to translation. In the first of these, Gutt (1989, 2000) explained translation decisions made by the translator when interpreting the source text for their own audience, using examples from translations of poetry, literature, marketing material, and the Bible. Alves/Gonçalves (2013) analysed specialised translation data (introductions to research articles from scientific journals), measuring instances of conceptual and procedural encoding of words and expressions along with technical translation effort (the number of edits) as recorded by translation process research (TPR) software. They found a relationship between post-editing effort and cognitive effect that supports a "relevance-theoretic account of processing effort in translation" (Alves/Gonçalves 2013: 121). In a follow-on study, encompassing measurement of temporal and cognitive effort (using TPR software and an eye-tracker, respectively), Alves et al. (2014: 169) concluded that the processing of procedurally-encoded problems required the "allocation of longer stretches of time, and eventually more processing effort". If the presence of conceptually or procedurally encoded words in a segment could be used as a predictor of translation or post-editing effort, this would present an opportunity to minimise the use of procedural items in source texts intended for translation. The current research builds on the work of Alves et al. by examining relevance-theoretic features in translation beyond procedural and conceptual encoding.
Relevance theory (Sperber/Wilson 1986/1995) is a cognitively-grounded theory of communication. It explains human communication in terms of two principles of relevance:

Cognitive Principle of Relevance
Human cognition tends to be geared towards the maximisation of relevance.

Communicative Principle of Relevance
Every act of ostensive communication communicates a presumption of its own optimal relevance (Sperber/Wilson 1986/95: 260).
The first, the Cognitive Principle of Relevance, reflects how human cognition develops so that one pays attention to what is relevant to them; in other words, how one tends to pay attention to what makes a change in our cognitive environment. This cognitive tendency is related to the manner in which we succeed in communication, or, the Communicative Principle of Relevance. This second principle predicts that a communicator, by the very act of claiming an audience's attention, suggests that the information she is offering is relevant enough to be worth the audience's attention (Sperber/Wilson 1995: 260). That is, when a speaker tries to claim our attention by providing a communicative stimulus, we can safely assume that whatever the speaker wants to communicate must be worth our attention, i.e. that it makes some changes to our cognitive environment. In other words, the ostensive stimulus is relevant enough for it to be worth the addressee's effort to process it, and is relevant to and compatible with the communicator's abilities and preferences (Sperber/Wilson 1986/95: 270). This is called the presumption of optimal relevance.
Relevance theory acknowledges that not all utterances are used in the same way. Some could be used descriptively, to state the status of affairs with truth-value, and others could be used interpretively, in order to communicate a thought that belongs to someone else. For example, if the speaker looks out of the window and says 'it's raining,' his utterance is a description of the state of affairs and is true if and only if it is indeed raining. On the other hand, if the speaker reports someone else's thoughts, as in 'Mary says it's raining', then the representation 'it's raining' is used interpretively, regardless of the truth-value of the original utterance by Mary. Gutt's (1989, 2000) works on translation are largely based on this distinction.
In the previously mentioned research on translation effort, Alves/Gonçalves (2013) and Alves et al. (2014) analysed translated texts in terms of the relevance theoretic distinction of procedural and conceptual encoding. These are the two types of linguistic encoding that are acknowledged in relevance theory (Blakemore 1987, 2002, Wilson/Sperber 1993). Conceptual encoding involves what has traditionally been considered as 'coding' and is a constituent of 'a language of thought'. For example, apple encodes the concept APPLE, and desk encodes the concept DESK. In contrast, when an expression encodes procedural information, it encodes instructions for how an utterance is to be interpreted. Expressions standardly analysed as procedural within relevance theory are so-called discourse connectives, pronouns, and modal particles. However, in recent years, it has been suggested that there might be procedural-conceptual hybrid expressions. Indeed, Wilson (2011) suggests that there is no theoretical ground to reject the idea that some expressions might encode both a procedure and a concept. These expressions would include words such as hardly or almost that encode both a procedure and a concept. Interestingly, Carston (2016) pointed out that the notion of procedural encoding now includes a vast range of items, "some of the deepest components of I-language, such as pronouns […] together with communicative devices […] which would seem to fall well outside I-language" (Carston 2016: 155). This raises a question of what 'decoding' ultimately is. However, this is entirely beyond the scope of the current study.
While analyses by Alves/Gonçalves (2013) and Alves et al. (2014) suggested that translating procedural items required more editing effort, their analyses were focused on the English to Brazilian Portuguese pair. Therefore, it is not entirely clear whether the same is true for other language pairs, especially those that are distant from each other such as English to Japanese. The syntax of Japanese is generally S-O-V based, although most sentence components can be permutated, except for the main verb that needs to appear at the final position. This permutation is possible as the grammatical function of each phrase is often marked by particles (also known as postpositions) or case markers, unlike English, where the grammatical function of each word or phrase is often marked by the word order. In addition, unlike Indo-European languages such as English, explicit subjects are often unnecessary in Japanese and pronouns are seldom used.
Another interesting feature of Japanese is its rich repertoire of connective expressions. Some Japanese connectives are independent lexical items while others are marked as part of verb conjugation or inflection. In addition, noun phrases such as those using toki (time) can be used as connective phrases. For example, in the example below, toki is used as a noun in (a) while it is used as part of a connective phrase in (b):
Some of these connective expressions have been analysed in terms of procedural encoding (cf. Matsui 2003, Sasamoto 2008) but others, especially temporal expressions such as toki, seem conceptual (see 3.2 for further discussion). In comparison, English has a limited range of connective expressions such as so or and, and these are mostly analysed as encoding procedure. In the following sections, we examine whether the prevalent overlapping of procedural and conceptual encoding in Japanese results in findings in English-Japanese that differ from those in studies by Alves et al.

Methodology
While research on translation has tended to focus on European language pairs, this study uses naturally occurring translation industry data in English to Japanese, a language pair that presents automatic translation systems with major word order difficulties (Goto et al. 2012). The data were created within an open source translation interface (Omega-T) that combines TM, MT, and terminological assistance for users. The version of the Omega-T software used is called iOmega-T (Moran et al. 2014), and incorporates 'instrumentation', recording time-stamped translation process data such as users' keystrokes, propagation of TM suggestions, propagation of MT output, and incorporation of term suggestions. The data were originally created as logs by the iOmega-T tool during a commercial translation of English software documentation into Japanese for production in 2014. Although this kind of repetitive content has been considered suitable for computer-assisted translation, research has so far been sparse "in the context of the IT industry" (Tatsumi 2010: 29). This study puts the data to secondary use, and is presented with the caveat that the size of the corpora studied is limited (168 translation units (TUs) in total, including 59 MT proposals and 43 TM matches), and the number of translators is likely to be few (the data are anonymous as translator IDs were not recorded with the translation process data).
Temporal effort for each TU was measured in words per second, and the TUs were divided into five categories of processing speed, each with 33 to 35 TUs. As an exploratory study, we took a grounded theory approach in order to 'let the data speak', and examined the data in a fine-grained manner to find any patterns that we considered interesting. Following some examination, we chose to focus on the provenance of the target text, i.e. whether the translation suggestion came from TM or MT (see Section 3.1), and the following linguistic features: conditional and temporal expressions (see Section 3.2), processing of demonstrative pronouns (e.g. words such as this, that, or it, used to indicate which entity is being referred to; see Section 3.3), and finally, particles, word order and prepositions (see Section 3.4). The TUs were analysed to find whether the features listed above had any relationship with processing speed as per the five categories. The source, target, and TM- (or MT-) suggested output analysis focused on the extent of the edits from TM/MT to target text based on these linguistic features. As this is an exploratory study, these features were annotated and enumerated without any prior hypothesis, although based on research by Alves et al. (2014), we expected to find a relationship between temporal effort and conceptual/procedural encoding, as we discuss in 3.2.
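The division into five roughly equal categories of processing speed can be sketched as a rank-based split. This is only an illustration: the exact cut points used in the study are not reported, the function name assign_categories is hypothetical, and the numbering (Category 1 slowest, Category 5 fastest) follows the convention stated alongside Table 1.

```python
def assign_categories(speeds, n_categories=5):
    """Assign each segment a category from 1 (slowest) to n_categories (fastest).

    Segments are ranked by words/second and cut into groups whose sizes
    differ by at most one. This is an assumed procedure, not the authors'.
    """
    # Indices of the segments, ordered from slowest to fastest.
    order = sorted(range(len(speeds)), key=lambda i: speeds[i])
    base, extra = divmod(len(speeds), n_categories)
    categories = [0] * len(speeds)
    start = 0
    for cat in range(1, n_categories + 1):
        # The first `extra` categories take one additional segment.
        size = base + (1 if cat <= extra else 0)
        for i in order[start:start + size]:
            categories[i] = cat
        start += size
    return categories


if __name__ == "__main__":
    # Five made-up speeds in words/second, one per category.
    print(assign_categories([0.5, 0.1, 0.3, 0.2, 0.4]))
```

With 168 segments this split yields groups of 33 or 34, which is compatible with, but not identical to, the reported range of 33 to 35 per category.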
The data were recorded in two sessions: one log contains 126 translation units, and the other contains 73 translation units. Several of the translation units were incomplete, and others were rejected as outliers. The first ten segments of Log 1 were translated very slowly as the translator/s became accustomed to the interface, and were rejected as a result. Other segments were rejected if they took longer than an upper limit of 300 seconds to translate, as we assumed that the translator left the segment open and moved on to something else. This left 168 segments in the study: 59 post-edited from MT proposals, 43 translated with assistance from TM matches, and 66 translated from scratch. Common corpus analyses (such as the type/token ratio of lexical variation) in the WordSmith WordList tool showed a mean segment length of 20.89 words, with a mean word length of 4.86 characters.
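The rejection rules and the speed measure described above can be sketched as follows. The Segment record and its field names are hypothetical and do not reflect the iOmega-T log format; counting source (English) words is also an assumption, since the text does not say which side of the TU was counted.

```python
from dataclasses import dataclass

MAX_SECONDS = 300.0    # upper time limit stated in the text
WARMUP_SEGMENTS = 10   # first ten segments of Log 1 were rejected


@dataclass
class Segment:
    log_id: int          # 1 or 2 (hypothetical field)
    index: int           # 0-based position within its log (hypothetical field)
    source_words: int    # assumed: words counted on the English source side
    seconds: float       # time the segment was open
    complete: bool = True


def keep(seg: Segment) -> bool:
    """Apply the three rejection rules described in the text."""
    if not seg.complete:
        return False
    if seg.log_id == 1 and seg.index < WARMUP_SEGMENTS:
        return False
    if seg.seconds > MAX_SECONDS:
        return False
    return seg.seconds > 0  # guard against division by zero


def words_per_second(seg: Segment) -> float:
    """Processing speed used as the measure of temporal effort."""
    return seg.source_words / seg.seconds


def filter_segments(segments):
    return [s for s in segments if keep(s)]


if __name__ == "__main__":
    demo = [
        Segment(log_id=1, index=3, source_words=20, seconds=90.0),   # warm-up
        Segment(log_id=1, index=12, source_words=21, seconds=75.0),  # kept
        Segment(log_id=2, index=0, source_words=18, seconds=410.0),  # too slow
    ]
    for seg in filter_segments(demo):
        print(seg.index, round(words_per_second(seg), 2))
```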

Results
For each segment, we calculated the temporal effort in words per second as described in Section 2. The average processing speed for the whole project was 0.28 words/second. This result compares favourably with previous research (Moorkens/O'Brien 2015) that contrasted expert and novice rates of productivity for English-German post-editing (a process more speed-focussed than the mix of post-editing, TM editing, and translation from scratch in the current study) and found an average expert rate of 0.39 words/second, and average novice rate of 0.13 words/second. In order to investigate whether there is a relationship between translation process and processing speed, we divided the 168 segments into one of five roughly equally-sized categories of temporal effort on the basis of the words/second results as shown in Table 1. TUs were annotated based on features listed in Section 2, which suggest a relationship with processing speed as categorised in Table 1. These are presented in the following sections.

MT and TM suggestions
As shown in Table 1, segments processed more quickly tend to have been translated with the assistance of TM. This is probably to be expected, as TM matches would only be presented to the translator if there were few differences between the project source text (ST) and the TM ST (depending on the fuzzy match threshold), and thus require few changes. While the average processing speed overall was found to be 0.28 words/second, the average speed for segments translated with TM was 0.42 words/second. For segments post-edited from MT, the average speed was 0.23 words/second, and for those translated from scratch, 0.24 words/second. Of the 25 segments processed most quickly, 15 showed no edits from the TM or MT, and five showed a minor change of a single word (such as the decision to use the borrowed word for 'triad' rather than the kanji compound). Helpfully for the translator, the changes required in the TM suggestion are likely to be indicated within the user interface. This tendency is also clear in Figure 1, which shows the percentage of segments in each category translated with TM (including a trend line) and those post-edited from MT. The relationship between MT output and categories of temporal effort is less clear. It has been noted (in Moorkens/O'Brien 2015, for example) that the variable quality and lack of confidence estimation for MT output mean that the user needs to maintain constant vigilance when post-editing MT. This means that the decision to retain the MT output for editing and to decide what to edit takes time, particularly for mid-ranking MT outputs, which require substantial editing (De Almeida 2013, Moorkens/Way 2016).
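As a consistency check on these averages, the per-type speeds can be pooled using the segment counts reported in Section 2 (43 TM, 59 MT, 66 from scratch). The sketch below assumes each reported figure is a mean of per-segment speeds, which the text does not state explicitly; the function name pooled_mean is hypothetical.

```python
# (number of segments, mean words/second) per translation type, as reported.
GROUPS = {
    "TM": (43, 0.42),
    "MT": (59, 0.23),
    "scratch": (66, 0.24),
}


def pooled_mean(groups):
    """Mean speed over all segments, weighting each type by its segment count."""
    total_segments = sum(n for n, _ in groups.values())
    weighted_sum = sum(n * mean for n, mean in groups.values())
    return weighted_sum / total_segments


if __name__ == "__main__":
    print(round(pooled_mean(GROUPS), 2))  # 0.28
```

Under that assumption the pooled value rounds to 0.28 words/second, matching the overall average given above.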

Processing of conditional and temporal expressions
The nature of the ST meant a prevalence of conditional subjunctives in the ST (if- or when-phrases). These are translated using conditional and temporal expressions in Japanese (baai, tara, to, toki). In relevance theory, some expressions are considered to not only encode concepts, but also an orientation for argumentation or inference (see Ducrot 1980). For example, Wilson (2011) illustrated how the natural language conditional if encodes both a concept and a procedure. She argued that if in the utterance If P then Q encodes the concept IF, which provides access to logical inference rules (or procedures) such as Modus Ponens. The procedural constraint activated by the use of if guides the hearer to recover the antecedent P. Similarly, Takeuchi (2015) described the Japanese conditional tara as encoding a procedure. However, not all expressions are explicitly analysed in terms of the procedural-conceptual distinction. Expressions such as toki or baai are nouns that exhibit some features of conceptual encoding, as these are accessible (i.e. they can be brought to consciousness) and can be paraphrased using different conceptual expressions (see Wilson/Sperber 1993, Blakemore 2002 or Carston 2016 for fuller discussions on the characteristics of procedural encoding). Nevertheless, it is clear that these expressions are all concerned with an argumentative or inferential orientation and therefore, are likely to be located at the border of the procedural-conceptual distinction.
Indeed, this particular distinction seems to affect the amount of temporal effort required. As we can see in Figure 2, the presence of temporal or conditional expressions seems to be associated with higher temporal effort, with the highest prevalence of these expressions (whether manually added or present in the MT or TM suggestion) in the categories with the slowest processing speed (19 in Category 1, 17 in Category 2, 12 in Category 3). This is particularly prominent when such expressions had to be manually inserted by translators, with the number per category closely following the trend line in Figure 2. We also checked whether the presence of a conditional or temporal phrase in the ST made any difference to temporal effort. Interestingly, we did not see any related change in processing speed. This is perhaps because in most cases, conditional or temporal expressions were inserted by the translator when the ST had a heavy NP subject, which required a substantial edit.

Processing of pronouns
Let us now consider the case of pronouns. Pronouns are representative cases of procedural items. Scott (2016) illustrated how pronouns encode procedure rather than concept, and they instruct the hearer to identify the intended referent. As mentioned in Section 1, the use of pronouns in Japanese and English is radically different. It is often the case that Japanese does not require explicit subjects, whereas English almost always requires subjects to be made explicit. As a result, demonstrative pronouns in English such as this and that, or personal pronouns such as you are often omitted or rephrased with a full noun-phrase in the Japanese TT. Indeed, in our data we found most pronouns in the ST were either omitted or supplemented using a full noun-phrase in the TT. That is, the translation of pronouns in the ST does not necessarily result in another procedural entity in the TT.
As we can see in Figure 3, the presence of demonstrative pronouns seems to be associated with temporal effort, with fewer being added or removed in categories where processing speed is fastest. Similarly, when a higher number of such expressions are found in the ST, processing speed appears to be slower, as shown in Figure 4. The contrast between overall processing speed and processing speed for segments that contain pronouns is worth noting. As we can see in Figure 5, average words/second for MT was 0.230, and for translation from scratch was 0.237. However, if we focus on segments with pronouns in the ST, we can see that the processing speed for MT dropped to 0.179 and for translation from scratch dropped to 0.143. This suggests that translation bordering on the procedural-conceptual distinction seems to influence processing speeds negatively.

Particles, word order and prepositions
Our expectation based on Alves's work (2013, 2014) was that we would find a relationship between processing effort and the procedural/conceptual distinction. However, the more fluid distinction between conceptual and procedural encoding in our language pair made finding a clear relationship with processing effort more complicated. This is not particularly surprising, as English and Japanese have very different clusters of procedural items. However, when we focus on particles in Japanese, the results are interesting. Japanese uses a complex system of particles/postpositions, most of which are markers for case or other grammatical features. Particles in Japanese have traditionally been analysed as ji (function morphs). First, we focused on three of these particles (wa, ga, o), which are marked by word order in English. When we compare the results in each category, the difference might not be particularly noticeable, as demonstrated in Table 3. However, once we compare the relationship between the number of particles inserted by the translator in the least productive categories (1 to 3) as opposed to the more productive categories (4 and 5), the association with decreased temporal effort becomes very clear, as we can see in Table 4.
Second, in order to compare the temporal effort necessary for the translation of procedural encoding, we focused on prepositions in the ST (expressions such as with, to, at, in, on in English), which were translated into particles such as ni, de and to. In Section 3.3, the results presented in Figure 5 indicate that there is a relationship between the presence of pronouns in the ST (whether post-editing MT or translating from scratch) and temporal effort. When pronouns are present in the ST, the processing speed is reduced. We found a similar pattern here with prepositions and temporal effort. Examples such as the straightforward replacement of a preposition in the ST with a postposition particle in the TT and agreement with particle use in the MT/TM suggestions result in decreased temporal effort. In contrast, the translation of a preposition in the ST to a different phrase (often conditionals or other conjunctions) in the TT and disagreement with MT/TM suggestions result in increased temporal effort. As we saw earlier, the average processing speed in words/second for TM was 0.420, for MT was 0.230, and for translation from scratch was 0.237. When we isolated segments with prepositions in the ST, processing speeds dropped to 0.249 for TM, 0.193 for MT and 0.176 for translation from scratch. This suggests that the presence of procedural items, whether in the ST or the TT (or both), reduces processing speeds (see Figure 6). In addition, we found that when the translator chose to translate a preposition in the ST without using a postposition/particle in the TT, the processing effort increased, with average processing speeds reducing to 0.18 for MT and 0.13 for translation from scratch.

Discussion and Conclusion
The scarcity of analyses of translation within the framework of relevance theory, particularly in the language pair of English to Japanese, meant that this study was necessarily exploratory in nature. As far as we are aware, there have not been any such previous studies using data from translations in an authentic setting. The findings in Section 3.1, wherein productivity appeared to increase when the translator was presented with more TM matches, were nonetheless unsurprising, and confirm the usefulness of TM suggestions with an appropriate match threshold. That TM matches were suggested for 25.6% of the source segments is in keeping with previous research (such as Moorkens/Way 2016), although in this study the matches were of high quality and required relatively few edits. Less expectedly, the average processing speed of post-edited segments was slower than translation from scratch, suggesting that the use of MT in this translation was of little benefit. Tatsumi (2010: 192) noted that "Japanese MT output often requires more extensive PE [post-editing] in general" compared to western European languages. The absence of the expected clear relationship between procedurally/conceptually encoded words and temporal effort was partially due to the language pair, as procedural items in the ST were not always directly aligned with similar items in the TT. The evolution of the distinction between conceptual and procedural encoding in contemporary relevance theory publications (see Wilson 2011, Carston 2016) means that isolating procedural items for analysis has become more difficult, and utterances may instead be considered to exist on a continuum between procedural and conceptually encoded meaning.
The linguistic features analysed (conditional and temporal expressions; demonstrative pronouns; particles, word order and prepositions) may be considered close to the procedural end of this continuum, the processing of which was associated with an increase in temporal effort. Segments that involved the processing of particles, pronouns, and conditional expressions, particularly when these items had to be manually added or removed in the TT, were found to have substantially lower productivity rates. Omissions, particularly of pronouns, have previously been commonly found in English to Japanese machine translations (Tatsumi 2010), necessitating manual addition by the translator/post-editor. We hope that, in this study, we have added a useful application of relevance theory to translation, while evincing the need for a more fine-grained analysis within this language pair, particularly due to the fluid distinction between the two kinds of encoded meaning in relevance theory.
While this work is not suggested to be generalizable (see Section 2), we are satisfied that our findings form a firm basis for some follow-on research. Accordingly, we intend to expand the study to analyse data from a broader group of translators, focusing on the linguistic features analysed that appear to show a relationship with temporal effort in this study. We also intend to include data from an unrelated language pair for comparative purposes, most likely in English to German. In doing so, we hope to shed new light on linguistic problems in computer-aided translation, and to add to the small amount of research on relevance theory and processing effort in translation.

Figure 1. Categories of Temporal Effort and the Percentage of Segments Translated using TM and MT

Figure 2. Categories of temporal effort, number of segments containing conditional expressions in the MT and conditional expressions added manually at the post-editing stage

Figure 3. Categories of temporal effort and addition or removal of demonstrative pronouns

Figure 4. Categories of temporal effort and demonstrative pronouns in ST

Figure 5. Comparison of average processing speed for translation type and average speed for segments with pronouns

Figure 6. Comparison of average processing speed (words/second) for translation type and average speed for segments with added postpositions

Table 1. Categories of Temporal Effort and Related Prevalence of MT and TM

Segments in each category exhibited similar characteristics (well within the standard deviation for each text) in a WordSmith analysis. Categories 1 (slowest) and 5 (fastest) had the lowest number of words per segment and Category 1 also had the highest type/token ratio, i.e. the fewest repeated words, as shown in Table 2.

Table 2. WordSmith Analysis of Data by Category

Table 3. Particles inserted by translator for each category (from slowest to fastest)

Table 4. Particles inserted by translator for combined categories