Revision of Business Content on Corporate Social Responsibility: Measuring the Impact of Training on the Cognitive Effort of Second-Language University Students

With more and more people interested in how sustainable and socially responsible companies are, the comprehensibility of content on corporate social responsibility (CSR) has become paramount. Producing easy-to-read business content – either by writing it from scratch or revising it – is a cognitively demanding undertaking, especially for second-language non-professional writers. Both formal training and sustained practice can help writers build expertise and, in turn, be considerate of their intended audience. In particular, research on the impact of training has usually yielded positive results when examining the texts produced following specific instruction. However, the extent to which training has a positive effect on the process of writing and revision is still under-researched, especially in a second language. To address this gap, we report on an experimental study that examines the impact of reader-oriented training on the cognitive effort experienced by 47 second-language university students when revising CSR content. We adopted a pre-test post-test design, and we used keystroke logging and retrospective interviews to collect data on students' pausing behaviour, use of online sources, and strategies to approach the revision task. Our training seemed to reduce the cognitive effort linked with lexical choices. Furthermore, it provided some students with procedural knowledge on how to approach the revision task in a more efficient way. We also observed a general tendency to rewrite from scratch (rather than revise) CSR content despite the higher cognitive effort required by rewriting. We discuss implications for training, limitations, and future research avenues.


Introduction
Writing and revising texts in order to suit the needs and expectations of the intended reader is a complex task, which tends to be particularly effortful for non-native speakers of a language (Barkaoui 2007). Training and sustained practice can help manage the cognitive effort required (Kellogg 2008). Therefore, in this article, we report on an empirical investigation of the impact of reader-oriented training on the revision process understood in a broad sense, whether it involved starting a text from scratch (rewriting), or modifying an existing text by trying to preserve as much of its original features as possible (narrow revision). Our focus was specifically on the cognitive effort experienced by second-language university students before and after taking part in our (online) training. To this end, we collected data using keystroke logging and retrospective interviews. The reader-oriented training that we developed (i.e. the intervention of this study) revolved around the accessible communication of business content on corporate social responsibility (CSR). The selection of this domain was motivated, on the one hand, by the growing interest of customers and lay people in the extent to which companies engage with socially responsible activities (Golob/Podnar 2019) and, on the other hand, by the low readability of CSR content, particularly in corporate reports (Smeuninx et al. 2020).

[...] exercises is asking students to revise a business advertisement or a brochure following plain language guidelines.
As this section has shown, training on reader-oriented communication is relevant to a variety of settings and domains. In the business domain, however, very little attention has been devoted to making CSR content easier to comprehend for a non-specialist audience (Rossetti/Van Waes 2022). This gap is surprising when considering the increasing interest in companies' sustainability efforts from a variety of stakeholders (Zamir/Saeed 2020), as well as previous research highlighting the difficulty of CSR content, particularly in corporate reports (Nilipour et al. 2020). Accordingly, in the current study, we selected CSR content as the focus of our training on reader-oriented communication (Section 3.2).

Expertise and language proficiency in revision
The goal of training is to help students/trainees bridge the expertise and knowledge gap (Carter 1990). When it comes to the revision of texts, higher writing expertise levels result in the ability to consider the text as a whole and to make global, structural changes that consider genre, medium, and intended reader (Lindgren et al. 2011). Furthermore, while the revision process of experts tends to be recursive and to involve different rounds (Becker 2006), less experienced writers tend to apply local, small changes to texts that preserve original meaning and structure (Faigley/Witte 1981). Interestingly, Whyatt (2018) found that the translation expertise of professionals is also transferable to monolingual paraphrasing tasks as this group produced higher-quality texts (in terms of readability, communicative quality, and errors) compared with language students. Wallace/Hayes (1991) ascribe these differences in expertise levels to the way in which revision tasks are represented in the mind of the writer/reviser. Specifically, after delivering 8-minute instruction to freshmen on how experienced revisers operate, the authors observed an improvement in revision quality (Wallace/Hayes 1991).
Similar to writing and translation expertise, levels of language proficiency can also determine the amount and types of revisions made. In his review of the research on the revision practices of experienced and less experienced writers in a first language and in a second language, Barkaoui (2007) discusses the dimensions along which these groups differ, namely: revision beliefs; audience awareness; revision time and frequency; revision processes; revision focus; and revision outcomes. In particular, with regard to audience awareness, Barkaoui (2007: 84) reports that: 'Skilled writers [...] tend to spend more time thinking about the effect they have on their reader, how they wish to present themselves to the reader, what background knowledge the reader needs to have to comprehend their text, and what may interest the reader.'
The author points out that these revision practices should be regarded as a continuum, since novice writers can gradually develop their expertise in (second language) writing through practice (Barkaoui 2007). Overall, despite some evidence that certain revision behaviours are transferable from first language to second language, revisions in a second language appear to be more frequent and more time-consuming (Hall 1990; Lindgren/Sullivan 2006), with cognitive demands increasing also as a result of task complexity (Xu et al. 2021). Furthermore, native speakers and high-proficiency speakers of a second language tend to make more content and global edits, as opposed to the local edits (e.g., spelling or grammar) usually made by low-proficiency speakers (Barkaoui 2016; Zhang et al. 2017).

Managing cognitive effort
The observable differences among various expertise and language proficiency levels are linked with the way in which the cognitive demand imposed by revision is managed. Revising involves planning solutions to problems detected in a text, transcribing those solutions into new text and, if necessary, revising new and existing text (Hayes 2012). In conducting these activities, the reviser has to maintain a mental representation of the content planned by the author (when revising somebody else's text), of the text as it is at any given moment, and of the needs and expectations of the prospective reader within a context (Hayes 2012; Kellogg 2008; Schriver 2012). These actions put a strain on the reviser's working memory, which is responsible for the temporary storage and processing of information during writing and revision (Olive 2012).
The limited working memory resources are easily depleted when used for tasks that have not become automatised enough (as is the case with children struggling with orthography, writers working with texts of an unfamiliar genre/topic, or low-proficiency second-language writers) (McCutchen 2000, 2011). Skilled and experienced writers/revisers, on the other hand, have internalised knowledge and strategies, thus being able to retrieve them from their long-term memory as needed by the task at hand and, in turn, circumventing working memory constraints (McCutchen 2011). However, the types of issues that need to be addressed in a text also play a role, with the notion of revision moving away from mere error correction to include different forms of text improvement (Conijn et al. 2021). In particular, while surface errors of spelling and grammar might require little cognitive effort for experienced writers and/or native speakers, the ill-defined considerations on readership and global text features (e.g., cohesion) will still require more mental resources and more text readings (Piolat et al. 2004). For writers of English as a second language, for example, the use of discourse features, such as cohesive devices, might require specific instruction (Kobayashi/Rinnert 2001).
When confronted with a difficult task, writers and revisers can adopt different strategies to manage cognitive effort. One of these strategies is related to the interaction with the text to be revised. In their model of revision, Hayes et al. (1987) argue that, when a reviser realises that a text contains too many issues, and/or that they do not know how to solve them (e.g., in the case of novices), they might sometimes find it simpler to extract the gist of the text and rewrite it from scratch, rather than addressing individual problems (Hayes 2004). Such rewriting might be local (e.g., paraphrasing individual sentences) or global (redrafting entire sections) (Hayes et al. 1987). Therefore, the authors distinguish between, on the one hand, revision in a narrow sense (henceforth 'narrow revision'; see Section 5.2), whereby the reviser tries to address issues in a text while preserving as much of the original structure and content as possible, and, on the other hand, rewriting (Hayes et al. 1987). We regarded the model by Hayes et al. (1987) as particularly relevant for this study since it focuses on the revision of already existing texts, which was the task carried out by our participants (Hayes 2004) (Section 3.1). A similar distinction is also reported in Allal/Chanquoy (2004) between 'editing' (corrections and modifications that do not change the meaning of the text) and 'rewriting', which results in transformations of content, organisation, and meaning.
Rewriting (namely, starting a text from scratch rather than trying to preserve as much as possible of the original text) might initially appear to be the most efficient option since it allows the writer to avoid the constraints represented by the characteristics of the original text. However, rewriting might also prove unexpectedly challenging for less fluent and less experienced writers (e.g., low-proficiency second-language speakers), who might experience concern for issues of language or genre conventions with which they are not familiar (Barkaoui 2007). To the best of our knowledge, no previous work in the area of writing research has empirically compared the cognitive effort of rewriting vs revising somebody else's text in a second language. However, comparable research related to (machine) translation shows that translating a text from scratch, similar to the activity of rewriting, tends to require more effort than revising the output produced by a machine translation system when the quality of such output is above a certain threshold (Koponen 2016). Therefore, it can be hypothesised that, when a text is of acceptable quality, revising it would require less effort than rewriting it from scratch.

Measuring cognitive effort
Researchers have used and combined various methods to measure the effort involved in (re)writing or revising (in a second language), such as questionnaires with self-reporting, think-aloud protocols, dual task, eye tracking, keystroke logging, and retrospective interviews (with stimulated recall) (see e.g., Kellogg et al. 2007; Lu/Révész 2021; Révész et al. 2019; Xu et al. 2021). In this section, we focus on the two methods that we selected for our study, namely keystroke logging and retrospective interviews.
At a high level, keystroke logging has been used to understand the cognitive processes, including mental effort, involved in writing and revision. The analysis of pauses is especially central in most keystroke logging studies (see Hall et al. (2022) for an overview). Pauses (namely interruptions in the flow of typing that can be recorded by keystroke logging tools) have been linked with effortful processes. The analysis of pauses in writing has shown that they occur, for instance, when the author has to plan new content, retrieve existing content from their memory, or revise the text produced so far, among others (Olive et al. 2009). Most keylogging studies have shown that pauses are longer and occur more often at more global text boundaries, which involve higher-level and more complex cognitive processes. Of course, pausing behaviour (pause length and pause distribution) is an indirect indicator of cognitive effort, and we should be aware of the fact that writers might lose focus at a certain moment, feel the need for some 'downtime', or engage in 'sociopsychological' activities (e.g., daydreaming) (Révész et al. 2019). However, there is ample evidence that pause measures are reliable indicators of cognitive effort.
The need to understand cognitive processes in writing has translated into analyses of pause locations (e.g., within or between words, between sentences, or between paragraphs) and pause length, as well as fluency, text elements revised, reading activity, and use of sources, among others (Chukharev-Hudilainen et al. 2019; Révész et al. 2019; Van Waes/Leijten 2015). Moreover, findings from previous research have shown that text production in a second language tends to be slower than in a first language, involving more pauses and decreased fluency (Chukharev-Hudilainen et al. 2019). Additionally, second-language text production tends to involve a more varied use of sources, which has an impact on the fragmentation of writing/revision processes and can increase cognitive effort.
Retrospective interviews have also been used to collect data on multiple aspects related to writing and revision, such as reasons behind revisions (in a second language); use of feedback while revising; and explanations of pausing behaviour (Shen/Chen 2021; Suzuki 2008; Zhao 2010). For instance, the less-skilled writers in the study by Shen/Chen (2021) reported struggling with linguistic encoding (spelling, grammar, or vocabulary).

Addressing a research gap
As this review of previous literature has shown, adapting texts to the intended audience is an effortful task, particularly for less experienced and second-language writers/revisers. Training and practice can help bridge the expertise gap. However, the effect of training on the revision process carried out in a second language is still an under-researched area. For this reason, this experimental study sought to answer the following research question: What is the impact of reader-oriented training on the revision process of second-language university students? We focused in particular on the cognitive effort of the revision process, as indicated by students' pausing behaviour, use of sources, and strategies to approach the revision task (i.e., narrow revision vs. rewriting). Moreover, regarding the strategies, it should be noted that, throughout this paper, we will use the umbrella term 'revision' to include both narrow revision and rewriting.
In addition to filling this research gap, this study contains other elements of novelty. First of all, unlike in most previous work, our participants carried out their revisions on already existing texts produced by others, rather than on their own texts. Secondly, this is the first empirical investigation of the differences in pausing behaviour associated with rewriting versus revising in a second language. Thirdly, to the best of our knowledge, this study is the first to focus on the revision of business texts dealing with CSR. In Section 3 below we report on the methodology adopted to answer our research question and to fill these research gaps.

Set-up: participants, data collection, and tasks
We collected data from a total of 47 students from the Faculty of Business and Economics at the University of Antwerp between October and November 2021. Data collection took place in person in computer laboratories on campus. Upon signing up for the study, participants were randomly divided into an experimental group and a control group. Participants in both groups took part in a pre-test session and a post-test session.
During the pre-test session, participants read an information sheet and signed a consent form. Subsequently, they were asked to read an extract of a corporate report produced by a fictitious tobacco company (SmokIT) and to convert it into an easy-to-read and engaging post that could be published on the company's website. We informed the students that they could be creative and inventive with their revisions, that they could do online searches, and that they could take as much time as needed. The revision task took place in Microsoft Word while the keystroke logging tool Inputlog was running in the background. Therefore, we could record the students' typing, pauses, and use of online sources (Leijten/Van Waes 2013). Furthermore, Inputlog saved intermediate versions of the participants' drafts every 2 minutes. Following the revision task, participants were asked to access an online module. The experimental group was assigned a module on accessible communication applied to CSR content, while the control group was assigned a module on the topic of CSR (Section 3.2). Students were instructed to spend at least 45 minutes familiarising themselves with the theory and the exercises in the modules. Finally, at the end of the pre-test session, all participants completed a short demographic questionnaire (Section 3.4).
The post-test sessions took place 2-3 days after the pre-test sessions. As part of the post-test, the students conducted the revision task in the case study of their respective modules. Similar to the pre-test, this revision task involved converting another extract of a corporate report into an accessible and engaging post that could be published on the website of the same tobacco company (see Section 3.3 for details about the texts to be revised). For this task as well, students could be creative and inventive, do online searches, and take as much time as needed. However, unlike in the pre-test, in the post-test the participants could also consult their respective modules and use them to guide the revision task. This task also took place in Microsoft Word while the keystroke logging tool Inputlog was collecting data in the background. Finally, all participants completed a fidelity questionnaire with multiple-choice questions about the content of their respective modules. Even though correct answers to multiple-choice questions might be guessed, we treated this fidelity questionnaire as a 'quick and dirty' way of checking that the participants had actually interacted with content from the modules. Since the experimental group and the control group had been assigned different modules, the multiple-choice questions differed between the two groups. Results of the fidelity questionnaire are reported in Section 3.4.
In the post-test session, we also randomly selected a sub-group of participants (3 from the experimental group and 4 from the control group) and we collected screen-recorded and interview data from them so as to allow for a more qualitative analysis. The screen-recorded data will not be analysed in this paper. The interviews involved short follow-up questions at the end of the revision task. These retrospective interviews focused on the process/product of the revision task and the potential benefits of taking part in our training. We used semi-structured interviews, which allowed us to discuss the topics of interest but also to adjust the questions to the natural flow of the conversation and to the emergence of additional topics of interest (Matthews/Ross 2010).

Modules
In order to empirically test the impact of training on accessible/reader-oriented communication, the experimental group received a link to an online module where the principles of accessible communication were applied to CSR content. In contrast, the control group received a link to an online module dealing exclusively with CSR. Both modules were hosted by the online writing center Calliope, where content was divided into introduction, theory, exercises, and case study, and students could move freely between the different components depending on their learning styles (Van Waes et al. 2014). Furthermore, in both modules, theoretical content was provided both in textual and in audio-visual format. The revision task carried out by the students in the post-test session was located on the case study page of the respective modules.

Module for experimental group
The theoretical content in the experimental module was divided into three main sections: (i) accessible communication; (ii) CSR; and (iii) the revision process. The section on accessible communication focused on vocabulary, sentence length/structure, cohesion, visual aspects, and relevance. For instance, the students were shown how to simplify sentences, how to use connectives to build cohesion, or how to replace complex vocabulary with simpler synonyms. The section on CSR focused on its definition and on the differences between two channels of communication (i.e., corporate reports and corporate websites), in line with the goal of the revision task (Section 3.1). The section on the revision process relied on Inputlog process graphs and videos in order to show students how expert and novice revisers differ in terms of their revision processes (Leijten/Van Waes 2013). We included this section in order to foster observational learning, whereby students learn by observing the actions and hearing the explanations of experts (Rijlaarsdam et al. 2008). For a more detailed description of the module given to the experimental group, see Rossetti/Van Waes (2022).

Module for control group
The theoretical content in the control module was divided into three main sections, namely definition, evolution, and communication of CSR. The section on the communication of CSR contained links to the websites of companies well known for their CSR activities and for their clear communications. Students in the control group could of course consult these pages but they received no formal instruction on how to write accessible texts.
In the interest of clarity, Figure 1 presents a visual overview of the procedure described in the previous sections.

Materials (texts)
In order to avoid a learning effect (whereby students would learn from the pre-test task rather than from our module/intervention), the texts that we selected for revision in the pre-test and in the post-test differed in terms of topic. However, they both dealt with the socially responsible activities carried out by a fictitious tobacco company called SmokIT. The pre-test text dealt with the company's efforts to reduce their carbon emissions, while the post-test text revolved around the company's investment in smoke-free alternatives. Both texts were extracts of a real corporate report, but we manipulated some of their features to make sure that the two revision tasks were similar and that the texts contained some of the problems mentioned in the Calliope module assigned to the experimental group (Section 3.2). The decision to assign students extracts of corporate reports to be revised into website posts (Section 3.1) was motivated by previous research showing that, while CSR content in corporate reports tends to be too complex and too technical for lay customers, websites offer an opportunity for companies to present CSR information in accessible language and engaging format (Smeuninx et al. 2020; Wei 2020). The length and the original level of readability of the selected texts might have acted as confounding variables. Therefore, we ensured that the two texts were as comparable as possible. Using the Coh-Metrix Common Core Text Ease and Readability Assessor (T.E.R.A.), we observed that both texts had a low level of narrativity (which means that they contained a low proportion of common words and verbs, and a high proportion of complex noun phrases), with the pre-test text having a particularly low percentage (Jackson et al. 2016). Both texts were also similar in their levels of syntactic simplicity (low, especially in the post-test) and referential cohesion (average, but lower in the pre-test text).
Referential cohesion refers to the repetition of words, phrases, and concepts across sentences (Jackson et al. 2016). Deep cohesion, determined by the amount of connections among the events and ideas in the text (Jackson et al. 2016), was higher in the pre-test text, but the narrativity and referential cohesion were lower in this text, so we assumed that these readability features would compensate for each other. Regarding text length and structure, the two texts had a very similar number of words (274 vs. 278 words) and they were both divided into three paragraphs. Furthermore, they both lacked (sub-)headings and visual devices to support the processing of content.

Demographic profiles and interaction with modules
As explained in Section 3.1, students filled out a demographic questionnaire after the pre-test session, and a fidelity questionnaire after the post-test session. The goal of the demographic questionnaire was to ensure that participants in the experimental and in the control group had similar background characteristics and prior knowledge. Furthermore, participants self-reported how much they had learnt from the respective modules. The goal of the fidelity questionnaire, on the other hand, was to more objectively assess participants' levels of interaction with the modules by asking them 4-5 multiple-choice questions about the theoretical contents. We assigned a score of 1 to correct answers and a score of 0 to wrong answers. 'I don't know' answers were treated as missing values. We calculated the mean score for each question, and then the grand mean of all questions. In the interests of clarity, we report these data as percentage values. Table 1 contains an overview of demographic and fidelity data. The control group was not asked the questions about pre-module and post-module knowledge about accessible communication. Therefore, we marked those cells as 'N/A' in Table 1.
It can be observed that participants in both groups were similar in terms of age, gender distribution, linguistic profile (most of them reported Dutch as native language), average number of years spent studying English (the language of the texts to be revised), and current programme of study (specifically, most of the participants were starting a Master in Multilingual Professional Communication). Furthermore, the majority of them had not previously attended training on accessible communication. However, in the experimental group, most participants were already somewhat familiar with the principles of accessible communication even before looking at the assigned Calliope module. On average, they gave quite a high score about their ability to write accessible/easy-to-read texts following participation in our module. With regard to CSR, in both the experimental and the control group, there was some increase in their self-reported knowledge about the topic after looking at the modules, although with quite some within-group variability. Finally, the grand mean scores from the fidelity questionnaire seem to indicate that participants in both groups engaged thoroughly with the theoretical contents provided in the modules.
[Table 1 (fragment): Female=20, Male=3; Dutch as native language: N=24, N=21; additional native language(s): Tamazight]

The results of the fidelity questionnaires were also supported by the keystroke logging data provided by Inputlog. Specifically, the tool recorded the time spent by the participants on their respective modules during the pre-test session, after the revision task (Section 3.1). We observed that the experimental group spent on average 46 minutes on the module, while the control group spent on average 34 minutes, possibly because of the more limited theoretical content (the control group did not have access to theory on accessible communication).
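The scoring of the fidelity questionnaire described in this section (1 for a correct answer, 0 for a wrong answer, 'I don't know' treated as missing, then per-question means and a grand mean reported as a percentage) can be illustrated with a minimal Python sketch. The answer data, the question identifiers, and the function names below are hypothetical and serve only as an illustration; they are not the study's data or the authors' actual procedure.

```python
# Hypothetical answers: one dict per participant, keyed by question id.
# True = correct, False = wrong, None = "I don't know" (treated as missing).
answers = [
    {"q1": True, "q2": False, "q3": True, "q4": None},
    {"q1": True, "q2": True, "q3": False, "q4": True},
    {"q1": None, "q2": True, "q3": True, "q4": True},
]


def question_means(responses):
    """Mean score per question (1 = correct, 0 = wrong), skipping missing values."""
    means = {}
    questions = sorted({q for r in responses for q in r})
    for q in questions:
        scores = [1 if r[q] else 0 for r in responses if r.get(q) is not None]
        if scores:  # a question with only missing answers gets no mean
            means[q] = sum(scores) / len(scores)
    return means


def grand_mean_percentage(responses):
    """Grand mean of the per-question means, expressed as a percentage."""
    per_question = list(question_means(responses).values())
    return 100 * sum(per_question) / len(per_question)


print(question_means(answers))
print(round(grand_mean_percentage(answers), 1))
```

Note that the grand mean is taken over question means rather than over all individual answers, so questions with more missing values carry the same weight as the others.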

Keystroke logging data
The analysis of keystroke logging data was first carried out in Inputlog and then transferred to SPSS. It revolved around pausing behaviour and source use. Furthermore, we classified participants' overall revision strategies as either narrow revision or rewriting, following the classification reported in Hayes et al. (1987), and analysed the extent to which these different strategies were represented by keystroke logging data. In order to examine the effect of our reader-oriented training, for each variable of interest, we compared the pre-test with the post-test data of all participants (within-subjects), as well as the experimental group with the control group at the pre-test and post-test stage (between-subjects). A series of Shapiro-Wilk tests showed that the data were not normally distributed for all the (sub-)variables of interest. Therefore, in the interests of consistency, we ran non-parametric tests (Mann-Whitney U test for the between-group analyses and Wilcoxon signed rank test for the within-group analyses).
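To make the two rank-based tests more concrete, the following Python sketch computes only their test statistics (U for two independent groups, and the signed-rank statistic for paired pre-test/post-test values). It is an illustration under stated assumptions: the analyses in this study were run in SPSS, the helper names are hypothetical, zero differences are simply dropped in the paired case, and no p-values are derived here.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """U statistic for two independent groups (the smaller of U_a and U_b)."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


def wilcoxon_w(pre, post):
    """Signed-rank statistic for paired data (the smaller of W+ and W-).

    Assumption: pairs with a zero difference are dropped before ranking.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ranks = midranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)


# Hypothetical values, e.g. total pause time in seconds.
print(mann_whitney_u([1, 4, 5], [2, 3, 6]))
print(wilcoxon_w([10, 12, 9, 11], [12, 11, 12, 11]))
```

A small statistic in either test indicates that one group (or one test moment) tends to have systematically lower values; judging significance would additionally require the sampling distribution of the statistic, which a statistics package provides.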

Data preparation
Before starting data analysis, we checked and time-filtered the raw IDFX files produced by Inputlog, as described in . This filtering ensured that we removed noise and irrelevant data (such as the time spent between our start of Inputlog and the start of the revision task by the participants). Firstly, we used Inputlog to produce a general analysis of all the raw IDFX files. In the general analysis, every row corresponds to a log event (keyboard or mouse action), for which cursor position, document length, and pauses are reported, along with the environment (focus) in which that event takes place (e.g., main Microsoft Word document, URLs of online pages, Calliope modules, and so on) (Leijten/Van Waes 2013). Based on the information reported in the general analysis files, we filtered the IDFX files from the focus in the main Microsoft Word document immediately preceding the first action (e.g., pressing RETURN or typing a letter) to the last action in the same document, where the revision task took place. Therefore, the filtered IDFX files contained keystroke logging data exclusively on the revision task.
Subsequently, we produced general analyses and process graphs from the filtered IDFXs. Process graphs are visual representations of the writing processes that include information on cursor position, text length at any given point, number of characters typed/edited/deleted, pauses (location and length), and source use (Vandermeulen et al. 2020). In examining all the process graphs and the general analysis files, we noticed that, for two participants in the pre-test and for eight participants in the post-test, there were time ranges when Inputlog incorrectly recorded actions as being carried out outside of the Microsoft Word document. In other words, events (typing, revisions, etc.) that were part of the revision task were not recorded as such. However, as the data loss represented only a small percentage of the overall task duration, we also included these IDFX files in the analysis.
Finally, we excluded 4 participants from the pre-test and 2 participants from the post-test because their IDFX files were irreparably damaged, and no data could be retrieved. This exclusion influenced the number of participants that could be included in the within-subjects analyses.
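The time-filtering logic described in this section (keeping everything from the focus on the main Word document immediately preceding the first action up to the last action in that document) can be sketched in Python as follows. The event representation is a simplified, hypothetical stand-in and not the actual IDFX format; the field names and the choice to count only keyboard events as 'actions' are assumptions made for illustration.

```python
def time_filter(events, task_focus="Word", action_kinds=("keyboard",)):
    """Keep events from the task-document focus that precedes the first action
    through the last action carried out in that document.

    Each event is a dict with hypothetical keys: time_ms, kind, focus.
    """
    action_idx = [
        i for i, e in enumerate(events)
        if e["focus"] == task_focus and e["kind"] in action_kinds
    ]
    if not action_idx:
        return []
    first, last = action_idx[0], action_idx[-1]
    start = first
    # Include the focus event immediately preceding the first action, if any.
    previous = events[first - 1] if first > 0 else None
    if previous and previous["kind"] == "focus" and previous["focus"] == task_focus:
        start = first - 1
    return events[start:last + 1]


# Hypothetical log: set-up time, the revision task with one source visit, wrap-up.
events = [
    {"time_ms": 0, "kind": "focus", "focus": "Inputlog"},
    {"time_ms": 4000, "kind": "focus", "focus": "Word"},
    {"time_ms": 5200, "kind": "keyboard", "focus": "Word"},
    {"time_ms": 9000, "kind": "focus", "focus": "Browser"},
    {"time_ms": 9500, "kind": "mouse", "focus": "Browser"},
    {"time_ms": 15000, "kind": "focus", "focus": "Word"},
    {"time_ms": 15800, "kind": "keyboard", "focus": "Word"},
    {"time_ms": 20000, "kind": "focus", "focus": "Inputlog"},
]

filtered = time_filter(events)
print(filtered[0]["time_ms"], filtered[-1]["time_ms"], len(filtered))
```

Events in other windows that fall between the first and the last action (here, the browser visit) are kept, since source use during the task is part of the process being analysed.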

Pause analysis
As parameters in the pause analysis, we used a pause threshold of 200 milliseconds and a P-Burst threshold of 2000 milliseconds. Pauses that are longer than 200 milliseconds are of interest because, for this cohort of participants, they are mainly linked with high-level, effortful cognitive processes (e.g., content planning) that exceed the low-level transitions between keys (Van Waes et al. 2009). P-Bursts are fluent writing episodes that are not interrupted by a pause reaching the P-Burst threshold. In addition to general measures (e.g., overall task duration, total number of pauses, or total pause time), we examined and compared pausing behaviour at different levels (i.e., within words, between words, between sentences, and before revisions). Furthermore, to account for individual differences in task duration, we included proportional values out of overall process time.
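The two thresholds can be illustrated with a minimal Python sketch that derives pause and P-Burst measures from a list of keystroke timestamps. The function name, the returned keys, and the sample timestamps are hypothetical, and treating a gap that equals the P-Burst threshold as a burst boundary is an assumption; the sketch does not reproduce Inputlog's pause analysis.

```python
PAUSE_THRESHOLD_MS = 200      # pause threshold reported above
P_BURST_THRESHOLD_MS = 2000   # P-Burst threshold reported above


def pause_measures(timestamps_ms):
    """Compute simple pause measures for one keystroke stream.

    Assumes at least two timestamps, sorted in ascending order.
    """
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    # Only gaps longer than the pause threshold count as pauses.
    pauses = [g for g in gaps if g > PAUSE_THRESHOLD_MS]

    # Start a new P-Burst whenever a gap reaches the P-Burst threshold.
    bursts, current = [], [timestamps_ms[0]]
    for t, gap in zip(timestamps_ms[1:], gaps):
        if gap >= P_BURST_THRESHOLD_MS:
            bursts.append(current)
            current = [t]
        else:
            current.append(t)
    bursts.append(current)

    process_time = timestamps_ms[-1] - timestamps_ms[0]
    return {
        "n_pauses": len(pauses),
        "total_pause_ms": sum(pauses),
        # Proportional value out of overall process time.
        "pause_proportion": sum(pauses) / process_time,
        "n_p_bursts": len(bursts),
    }


# Invented keystroke timestamps in milliseconds.
keys = [0, 150, 300, 900, 1000, 3500, 3600, 3700]
measures = pause_measures(keys)
```

Here the gaps of 600 ms and 2500 ms count as pauses, and only the 2500 ms gap splits the stream into two P-Bursts.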

Analysis of revision strategies
Informed by the model discussed in Hayes et al. (1987), we categorised each participant's strategy of approach to the task either as narrow revision or rewriting (despite the fact that both strategies were part of a larger revision task) (Section 3.1). We used the start of the task as the determining criterion in our classification. Concretely, if the first edit (addition, deletion, or substitution) that a participant made was located within the assigned text, we treated the overall strategy as 'narrow revision'. In contrast, if the first edit made was located outside the assigned text (e.g., under it), we treated the overall strategy as 'rewriting'.
When making this categorisation, we excluded outlines, instructions, and titles/headings. For instance, if a participant wrote a title on top of the assigned text but then continued to make the revisions within the text (rather than starting a new one), that strategy was treated as narrow revision. This categorisation was possible because we set Inputlog to save versions of the revised documents every 2 minutes. Therefore, in addition to the final texts produced by the participants, we also had access to their intermediate drafts. This way of categorising the revision strategies is certainly not perfect since participants might have switched from a rewriting to a narrow revision strategy (or vice versa) throughout the task. However, this categorisation allowed us to focus on the way in which these second-language students first interacted with the text to be revised using either of these strategies. Furthermore, as we will show in Section 5.2, this manual classification is partly supported by patterns in the keystroke logging data.
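The decision rule can be summarised in a short Python sketch. The classification in the study was carried out manually on the saved drafts, so the code below is only an illustration of the criterion: the edit records, the `kind` labels, and the `inside_assigned_text` flag are hypothetical.

```python
# Edit types that are ignored when looking for the first edit (see above).
EXCLUDED_KINDS = {"outline", "instruction", "heading"}


def classify_strategy(edits):
    """Classify a participant's approach from the chronological list of edits.

    Each edit is a dict with a hypothetical 'kind' label and a boolean
    'inside_assigned_text' flag.
    """
    for edit in edits:
        if edit["kind"] in EXCLUDED_KINDS:
            continue  # outlines, instructions and titles/headings do not count
        return "narrow revision" if edit["inside_assigned_text"] else "rewriting"
    return None  # no classifiable edit found


# Invented examples.
participant_a = [
    {"kind": "heading", "inside_assigned_text": False},  # title on top: ignored
    {"kind": "body", "inside_assigned_text": True},
]
participant_b = [
    {"kind": "body", "inside_assigned_text": False},     # new text under the original
]

label_a = classify_strategy(participant_a)
label_b = classify_strategy(participant_b)
```

Because only the first classifiable edit is considered, the sketch shares the limitation noted above: later switches between strategies within a session go undetected.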
First, we examined if and how keystroke logging data related to pauses reflected the two different strategies of rewriting and narrow revision both in the pre-test and in the post-test. To this end, we merged the data from the control and from the experimental group. Subsequently, we zoomed in on the participants who changed their revision strategies between the pre-test and the post-test session.

Analysis of source use
We also examined how our participants interacted with the main Word document and with external sources (e.g., modules and online pages) during the revision task. The keystroke logging tool Inputlog allowed us to collect data on time spent and on number of keystrokes made in each environment (Leijten/Van Waes 2013). In order to account for the differences in task duration, we report the time and keystrokes per source in proportional terms. Prior to data analysis, we recoded the IDFX files that we had filtered (Section 4.1.1) by grouping and categorising the different sources used by the participants. Specifically, we created 8 categories: (i) main document; (ii) language searches; (iii) topic searches; (iv) search starts; (v) modules; (vi) session instructions; (vii) calculations; and (viii) irrelevant. A description of the categories is available in Table 2.

Main document
Word document in which students carried out the revision task

Language searches
Internet searches aimed at finding the meaning, synonyms, and/or translation of words

Topic searches
Online searches whose goal was to find information on the CSR topics addressed in the texts (e.g., climate change or smoke-free products)

Search starts
Interruption of the writing flow: typing a search in an Internet browser and scrolling the results without opening a specific page. Note: this category includes different types of searches (language- and topic-related), but we did not assign these searches a specific goal because that was not always clear from the words typed.

Modules
Access to the modules assigned to the participants. Note: the data referring to this category are only meaningful in the post-test, since in the pre-test participants were not allowed to access their respective modules while revising.

Session instructions
Checking of task instructions provided in digital format

Calculations
Internet searches aimed at calculating values and percentages related to the text

Irrelevant
Use of online pages not related to the task at hand

Such categorisation was needed considering the high variability in students' use of sources. To give an example, Figure 2 contains some of the language-related online searches made by the participants.
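The proportional reporting of source use can be sketched in Python as follows. The input format (one tuple per visit to a source, with its category, duration, and keystroke count) and all figures are invented for illustration. Expressing keystrokes as a share of all keystrokes is an assumption, based on how the results are worded in Section 5.3.

```python
from collections import defaultdict


def source_proportions(visits):
    """Return {category: (share_of_time, share_of_keystrokes)}.

    visits: list of (category, duration_ms, n_keystrokes) tuples,
    one per visit to a source (hypothetical input format).
    """
    time_by = defaultdict(int)
    keys_by = defaultdict(int)
    for category, duration_ms, n_keystrokes in visits:
        time_by[category] += duration_ms
        keys_by[category] += n_keystrokes
    total_time = sum(time_by.values())
    total_keys = sum(keys_by.values())
    return {
        category: (time_by[category] / total_time,
                   keys_by[category] / total_keys)
        for category in time_by
    }


# Invented session of 10 minutes and 1000 keystrokes.
visits = [
    ("main document", 480_000, 900),
    ("language searches", 60_000, 60),
    ("topic searches", 30_000, 20),
    ("main document", 30_000, 20),
]
shares = source_proportions(visits)
```

In this invented session, the main document accounts for 85% of the time and 92% of the keystrokes, which makes sessions of different lengths directly comparable.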

Retrospective interview data
In the post-test session, for a randomly selected sub-group of participants (3 from the experimental group and 4 from the control group), we collected qualitative data in the form of short semi-structured retrospective interviews. The goal of these interviews was to complement the more quantitative keystroke logging data. Specifically, the questions revolved around the general approach to the task, perceived difficulty, planning and revision strategies, level of satisfaction with the final text, and helpfulness of the Calliope module.
After transcribing the audio recordings, the first author read and coded participants' answers in NVivo using an inductive and iterative approach. The goal of this coding was to identify important units of meaning in the data. After organising the codes into main themes and sub-themes, the first author and the second author discussed the (sub-)themes and reached agreement on their naming and categorisation.
We identified six themes, namely: (i) revision strategies; (ii) interaction with module; (iii) overall approach awareness; (iv) previous knowledge and experience; (v) reader-orientedness; and (vi) source use. The theme 'revision strategies' was in turn divided into seven sub-themes, each referring to a different type of issue that the participants addressed in the text when revising it. Most of these sub-themes are in line with the contents of the Calliope module assigned to the experimental group (Section 3.2). Specifically, they refer to vocabulary, syntax, text structure, visual aspects, relevance, tone, and contextual information. The theme 'overall approach awareness' was also divided into two sub-themes, respectively referring to planning how to address the text and to the revision of content.

Results
In this section, we report the results of our quantitative (Sections 5.1-5.3) and qualitative (Section 5.4) analyses. We use the code 'PE' to refer to participants from the experimental group, and the code 'PC' to refer to participants from the control group. Table 3 reports the pause-related results of the between-group analyses comparing the experimental and the control group during the pre-test and during the post-test. The table contains only a subset of all the pause-related sub-variables that we analysed. Specifically, we report the data related to the entire process (e.g., total active writing time) and to the word-, sentence-, and revision-level. The full list of pause-related sub-variables is available in Appendix A.

Pauses
Overall, the pausing behaviour of the experimental group and of the control group was quite similar both in the pre-test and in the post-test, for instance in terms of average duration of pauses, proportion of pauses out of overall process time, number of P-Bursts per minute, and their average duration. In other words, taking part in training did not lead to differences in pausing behaviour between the groups. In the pre-test, the control group showed a significantly higher proportion of between-sentence pauses (compared with the experimental group), which might be indicative of effortful planned sentence production and idea generation (Baaijen et al. 2012). This difference became non-significant in the post-test, possibly due to acquired familiarity with the revision task and/or with the topic of smoke-free products (Section 3.3).
When focusing on the within-group comparisons, the control group showed a significantly higher proportion of between-word pauses in the post-test (compared with the pre-test) (Z = -2.455, p = .01). This result might be due to the higher cognitive effort needed by this group when making lexical decisions. As the experimental group did not show a higher proportion of between-word pauses in the post-test, we can hypothesise that our training in plain language assisted them with lexical choices, for example in finding lay synonyms of specialised terms. This hypothesis is supported by the keystroke logging data linked with rewriting vs revision strategies (Section 5.2) and by the interview data (Section 5.4).

Table 3. Between-group analysis of pausing behaviour in the pre- and post-test

Revision strategies: narrow revision vs. rewriting
The most common strategy involved rewriting (more specifically, redrafting) (Hayes et al. 1987), whereby the participants abandoned the assigned text and started a new text from scratch, either above, below, or in the middle of the assigned text (e.g., when they rewrote one paragraph at a time). Overall, the rewriting strategy was used by 38 participants in the pre-test (90% of the total) and by 31 in the post-test (73% of the total). However, in some (less frequent) cases, the participants decided to revise the given text. This happened in total 4 times during the pre-test and 11 times during the post-test. It is also interesting to note that 6 participants in the control group and 7 participants in the experimental group switched between strategies depending on the session (pre-test vs post-test). Table 4 provides an overview of the strategies and their evolution.
Table 5 shows that, in the pre-test, rewriting involved a significantly higher proportion of pause time and significantly higher proportions of pauses at multiple text levels (within words, between words, and between sentences). These results seem to indicate that rewriting required more cognitive effort than revision as the participants (who were not native speakers of English) had to pay attention to spelling and grammar, come up with their own lexical choices, and generate ideas for the new text. In the post-test, the differences between rewriting and revision became less pronounced, possibly because the students had familiarised themselves with the task of rewriting a text on CSR in their second language, and/or could draw on a more concrete set of criteria following the training. However, the participants who chose rewriting still needed, on average, significantly more time than the participants who chose the narrow revision strategy, especially because of higher pausing times.
It is also interesting to note that, despite the low number of participants (especially in the narrow revision group), these keystroke logging data show that using different strategies leads to changes in pausing behaviour. With regard to the participants who showed a change in their revision strategy, some interesting patterns can be observed (Table 4). In the experimental group, the seven participants who changed revision strategies all switched from rewriting to narrow revision. In contrast, the control group showed more variability (3 people switched from narrow revision to rewriting, and 3 people from rewriting to narrow revision). In the experimental group, the change from rewriting (in the pre-test) to revision (in the post-test) resulted in significantly shorter between-word pauses (indicating lower cognitive effort for lexical choices), as shown by a Wilcoxon signed-rank test (Z = -2.366, p = .018). In the control group, no such difference was observed.
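For readers less familiar with the Wilcoxon signed-rank test, the following Python sketch computes the Z statistic with the normal approximation (without tie or continuity correction) for seven paired observations. The pause durations are invented and are not the study's data. The sketch shows that, when all seven paired differences point in the same direction, the approximation yields Z = -2.366 and a two-sided p of about .018, which matches the values reported above; whether the study's data followed exactly this pattern is an assumption.

```python
import math


def wilcoxon_z(pre, post):
    """Wilcoxon signed-rank Z (normal approximation, no tie correction)."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]  # drop zero differences
    n = len(diffs)
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided


# Invented mean between-word pause durations (ms) for seven participants.
pre_test = [1450, 1320, 1610, 1280, 1500, 1390, 1710]
post_test = [1100, 1150, 1300, 1200, 1210, 1180, 1250]

z, p = wilcoxon_z(pre_test, post_test)
print(f"Z = {z:.3f}, p = {p:.3f}")  # Z = -2.366, p = 0.018
```

With only seven pairs, this is also the most extreme Z that the approximation can produce, which is one reason why a larger sample would allow finer-grained comparisons.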

Source use
In Table 6, it can be observed that the source-related behaviour of the experimental and control groups was quite similar across the pre- and post-test sessions. Overall, participants devoted most of their time and their typing to working on the documents where the revision tasks took place, as would be expected, also based on previous research. For instance, between 80% and 90% of all keystrokes, and about 80% of the overall task duration, were devoted to the main document during the pre-test session. These percentages decreased slightly in the post-test, but they continued to represent the majority. Across sessions, the control group consistently dedicated more time and more typing to language searches, especially during the post-test session. When considering that the control group showed a significantly higher average number and proportion of between-word pauses in the post-test (Section 5.1), this result related to language searches seems again to point to the difficulty experienced by this group in finding appropriate lexical choices. Previous research on source-based writing in English as a second language has also highlighted students' use of online resources such as dictionaries and grammars. Regarding topic searches, the relative time devoted to them increased for all participants in the post-test. Based on the results of the retrospective interviews (Section 5.4), these topic searches were mainly aimed at the addition of contextual information on the topics of tobacco and smoke-free products.
We also examined differences within each group. A Wilcoxon signed-rank test showed that, compared with the pre-test, in the post-test the experimental group used less time (Z = -2.068, p = .039) and fewer keystrokes (Z = -2.305, p = .021) for starting online searches. Similarly, in the post-test, the control group spent less time starting online searches (Z = -2.052, p = .040), but more time on topic searches (Z = -3.385, p < .001), compared with the pre-test.
With regard to the time spent interacting with the Calliope modules while revising during the post-test session, participants in both groups spent on average between 3 and 4 minutes consulting their respective modules during the revision task. This might seem like a short time, but it should be remembered that participants had already familiarised themselves with the modules during the pre-test session, after completing the revision task (see Section 3.1). Specifically, the experimental group had spent on average 46 minutes on the module, while the control group had spent on average 34 minutes.

Retrospective interviews (post-test)
With regard to revision strategies, participants in both the experimental and the control group reportedly focused to a large extent on the same issues when revising the text. Mentions of vocabulary, syntax, visual aspects, tone, structure, relevance, and contextual information were present in interviews with both groups. For instance, with regard to vocabulary:
PC19: [D]ifficult words... I tried to look them up and explain them.
PE10: I would just remove all the jargon and the specific terms used in a report.
As far as visual aspects are concerned:
PC22: I used some bullet points as well.
PE10: And what I also did while revisioning my text is use more words that are highlighted so that it's more scannable.
In terms of tone, participants tried to make the text more personal, for example by using exclamation marks or replacing the tobacco company's name with the pronoun 'We'. Syntax-oriented edits involved shortening the sentences. Relevance-oriented comments were quite frequent, but some participants pointed out the difficulty of deciding which information to leave in and which information to leave out when revising:
PC13: Like, sometimes I was thinking, like 'Do I have to say this? Is this important? Is this too...?' And that was a little bit tricky for me.
Linked with relevance comments, one participant in the control group and one participant in the experimental group also pointed out the need to add some contextual information, specifically on the available portfolio of smoke-free products. The addition of such contextual information often involved doing online searches on the topic of tobacco and smoke-free products:
PE10: When I looked at the case, I saw a link to more information about smoke-free cigarettes, and when I was reading that article, I thought 'Oh, maybe it would be interesting to focus on what the product is, why you will switch to it and then what we do already for those smoke free cigarettes'.
Structure is the textual aspect that received the most attention from participants, who used titles and (sub-)headings to give structure to the text and to merge similar content together. It should be remembered that our texts lacked these elements (Section 3.3):
PE23: So this text, I think, some parts were kind of similar. Well, the content was the same. So I put them together and I left out what was extra. [...] When I saw what the paragraph was about, the main idea, I tried to put that into a title and then work on it further because the titles were missing.
The motivation behind the various changes made to the text seems to be reader-orientedness for both the control and the experimental group:
PC07: The difficulty may be some of the more technical aspects, how to properly communicate them to an audience that may not be fully familiar with some of the concepts.
PC19: [I]t's easier for the clients to understand them with an explanation, or an example, or something like that.
PC22: I tried to bring as much structure to the text as I could because I think it helps the readers on a website a lot when they're reading a text.
PE10: I just focused on what the products are about and the important information that a customer needs to know to switch, for example, to smoke-free products.
PE23: And then I was searching something about exclamation marks. I was thinking, if that was... If we could add that. But I did because I thought it was a text to the customers and it was on the website. So I thought, well, that it was OK to do that, that it wasn't too, well, business-like.
The fact that participants in the control group showed reader-orientedness might be due to their prior knowledge of the topic and experience with plain language, also mentioned during the interviews:
PC07: [N]ot to blow my own horn too much, but I do believe that I have a firm grasp of how to write a text that will convince people to at least look at the product that is being sold.
PC19: And second [text] it was easier because I think I know some more about this topic than the other topic.
When prompted by the questions, most participants were able to reflect upon their overall approach to the task, e.g., whether they followed a specific order, why they rewrote a new text from scratch rather than revising an existing one (Section 5.2), whether they already had a structure in mind, and so on. Overall, participants' processes showed features of expertise, for instance, in the use of multiple revision rounds or in the whole-text planning approach:
PC19: Sometimes I just write the structure that I want to have in my text. So, for instance, if I want to have a structure that is problem-solution based, I just write these small questions beforehand and then I answer them along the way.
PE18: [B]efore I did anything, I looked at the whole text. And then I started revising and I went... I did it a second time. Like, I wrote a paragraph, revised it, wrote a second, then revised both, wrote three and then revised the whole bit to see if it looked like a whole.
Interestingly, the choice to rewrite a new text from scratch rather than revising the given one seems a way to manage the cognitive difficulty of the task, as highlighted in our review of earlier work (Section 2.3):
PC13: [W]hat's very handy for me, was to not adapt the text in the text while I was writing, but just start a new text, and then read. So I was watching and then writing. Sometimes I wrote the same and then I deleted it and then.
One striking difference that emerged between the control and the experimental group refers to the sub-theme about planning how to approach the task. Specifically, participants in the control group talked about their difficulty with planning. In contrast, participants in the experimental group seemed to benefit from the module as it provided them with stepwise guidance on how to approach the text:
PC19: So I'm actually a bit relieved now, to be honest, because I thought I went through the entire Calliope and I thought to myself 'Where is the part where we learn how to write?' [...] [W]hen I was writing, it was easy going and I found some solutions here and there, but starting the writing process was rather difficult.
As the participants in the control group did not receive formal instruction on how to write accessible texts (Section 3.2), it is not surprising that, when interacting with their module, they found the concrete examples of accessible web communications particularly helpful:
PC19: [T]here was a lot of theory and it was interesting, but a lot of it I already knew, but I never saw a concrete example, so that was really interesting.
PC22: It helped me that there were some examples of websites that already had a page on CSR. I did use them. I did go back and forth and look at them. And so that helped me a lot.
However, exposure to these examples did not seem to translate into the same stepwise guidance that the experimental module provided.

Discussion and conclusions
We carried out a study to test the impact of reader-oriented training on the revision process. Our reader-oriented training (the intervention of this study) revolved around the accessible communication of CSR content. Using keystroke logging and retrospective interviews in a pre-test post-test design, we collected data on the cognitive effort (pauses, strategies, and source use) of 47 Master's students (with English as a second language) revising extracts of corporate reports in order to make them accessible enough for a corporate website. Students were randomly assigned to a control and to an experimental group, with the latter receiving our reader-oriented training.
On average, students had little prior knowledge about CSR (i.e., the topic of the texts). Most of them had not previously attended training on accessible communication. However, based on their academic studies in the domain of professional communication, our participants were between the knowledge-transforming and the knowledge-crafting stage of writing development, characterised by the ability to consider and balance the author's intentions, the features of the text at any given moment, and the reader's interpretation (Kellogg 2008). In other words, our students were already showing some traits of writing and revision expertise. For instance, as highlighted by the interview data, they made different rounds of revisions and they concerned themselves with a number of issues during their revisions, such as vocabulary, syntax, visual aspects, tone, structure, relevance, and contextual information (Becker 2006). Participants went so far as to remove and add content by relying on online searches about the topics of the texts. Such ability to consider global text features is also indicative of their high level of English proficiency. Put differently, the students seemed to have automatised procedures to deal with local issues (e.g., spelling and grammar), which left them with enough working memory resources to consider the text as a whole and its intended reader (Barkaoui 2016; McCutchen 2011).
Our training did not have an impact on the general pausing behaviour (and, by extension, on the cognitive effort) linked with the revision task. However, it seemed to reduce the cognitive effort linked with lexical choices for the experimental participants in the post-test, as indicated by the lower proportion of between-word pauses and by the consistently lower proportion of time and keystrokes devoted to language searches. For second-language writers and revisers, lexical decisions tend to be determined by level of language proficiency, with more proficient writers/revisers showing a higher degree of lexical sophistication (Kim/Crossley 2018). However, in our study, participants in the experimental and control group were similar in terms of average number of years of English study (and all quite proficient in English, as mentioned above). The fact that our intervention training, although limited in time and scope, assisted students in the experimental group with vocabulary selection is in line with previous research showing that instruction fostering awareness about the external context of a text (including the intended audience) can have a positive effect on second-language writing fluency and lexical sophistication (Yasuda 2011).
In examining the revision strategies adopted by the students, we observed a general preference for rewriting a text from scratch rather than carrying out internal revisions within the assigned text, despite the fact that rewriting was not presented as a strategy in either of the modules. Based on the qualitative comments, the decision to rewrite seems to be determined by the need to manage the complexity of the task and the resulting cognitive effort, as also pointed out in Hayes et al. (1987). In other words, our participants might have perceived the texts as too problematic and/or might have been unsure of how to address the issues identified, thus deciding that rewriting would be the more 'economical' approach. However, keystroke logging data related to pausing showed that, compared with narrow revision, rewriting actually required more cognitive and temporal effort. Accordingly, our students might have underestimated the effort required to write a semi-specialised text from scratch in a second language (Schoonen et al. 2009). Based on this observation, we contend that there is a need for providing less experienced and/or less proficient writers with training on how to accurately gauge the effort required by revision following a diagnosis of the issues in a text. Such decisions on how to approach a text might be particularly relevant in the workplace (e.g., in a corporate environment), where content tends to be produced and revised under time and resource constraints (Schriver 2012).
Some participants also switched strategy (from rewriting to revision, or vice versa) between pre-test and post-test. While the control group varied in their strategy switches, the experimental group showed a clear pattern of preferring narrow revision to rewriting in the post-test session. This decision allowed experimental participants to save cognitive effort linked with lexical decisions, as indicated by their significantly shorter between-word pauses. The fact that some of the students in the experimental group switched to a less cognitively demanding revision strategy in the post-test might be the result of the module on accessible communication in which they took part. In other words, if we assume that the choice between rewriting and revising is determined by the need to manage the (expected) cognitive effort of the task at hand (Hayes et al. 1987), then the content in the module (and the exemplary process videos included) might have shown students how to address issues in a text (e.g., in terms of vocabulary, syntax, cohesion, and so on) without having to start a new document from scratch, thus saving mental effort. The interview data seem to confirm these assumptions since the experimental participants reported that the training provided them with procedural knowledge on how to approach the text, and specifically: (i) knowledge on the textual aspects to be considered; and (ii) criteria against which to assess the quality/accessibility of a text. Such positive impact of training has not always been observed. For instance, Terryn et al. (2017) found that a course on translation revision did not significantly improve their students' procedural knowledge as they did not always adopt the most efficient procedure.

Limitations and future research
This study has some limitations which point to avenues for future research. The relatively small and homogeneous group of participants makes our results only partially generalisable to other contexts. Some analyses in particular, such as the keystroke logging data related to rewriting vs revision strategies, would benefit from a larger sample of participants. Future research should focus on the extent to which the adoption of these strategies influences multiple aspects related to the revision process, from pausing, to source use, to number and types of revisions. Using retrospective interviews or think-aloud protocols with a larger group of participants might also provide more insights into the reasons why second-language students prefer one strategy over another (and why they switch between strategies). Our participants were already proficient users of English as a second language and had considerable writing experience. In follow-up work, it might be interesting to test the impact of this (or a similar) module on less experienced/proficient students, and maybe also on expert writers.
We observed that, following participation in the training, the control group carried out more online topic searches, which is likely to be linked with the exclusive focus of their module on the topic of CSR. Therefore, another interesting research avenue would be to investigate if and how training affects source use during writing or revision. Furthermore, asking second-language students to revise texts of a different domain or genre (e.g., financial reports) might complement our data and shed light on other aspects of revision in a second language.
Finally, our focus in this paper was exclusively on the revision process. We are currently analysing the products (or texts) produced by the students from several perspectives to see if differences and similarities in the final texts are reflected in the observed processes.

Authors' note
This study was approved by the Ethics Committee for the Social Sciences and Humanities at the University of Antwerp.

Funding
This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 888918.

Appendix A: Full list of pause-related sub-variables examined
Between-group analysis of pausing behaviour in the pre- and post-test
Note: There are slight differences in scores between the within-subjects and the between-subjects analysis. These slight differences are due to the fact that, for the within-subjects analysis involving paired observations, some participants had to be excluded (see Section 4.1.1 on data preparation).