Social Interaction. Video-Based Studies of Human Sociality.

2022 Vol. 5, Issue 1

ISSN: 2446-3620

DOI: 10.7146/si.v5i2.130867

Multimodal Word-Search Trajectories in L2 Interaction:
The Use of Gesture and how it Changes over Time

Klara Skogmyr Marian & Simona Pekarek Doehler

University of Neuchâtel


This paper investigates the temporal dynamics of bodily and vocal conduct in the course of L2 word-searches. Based on a longitudinal dataset of L2 French conversations, we first identify a recurrent multimodal search-trajectory involving specific simultaneous and successive assemblies of hand movements/holds with gaze, and (para)verbal displays of ongoing search. We interpret these Gestalt-like trajectories as part of methodic practices through which speakers both account for breaks in progressivity and display their search as “solitary”, preempting recipient’s entry into the turn-in-progress. We then put our findings into a longitudinal perspective, showing how features of these assemblies change over time in the developmental trajectories of L2 speakers.

Keywords: word searches, gestures, social interaction, L2 French, development

1. Introduction

Second language (L2) interactions are ordinary interactions, governed by the same generic organizational principles as first language (L1) interactions (Wagner & Gardner, 2004). One aspect of such basic organization is the preference for progressivity of the interaction: “Moving from some element to a hearably-next-one with nothing intervening is the embodiment of, and the measure of, progressivity” (Schegloff, 2007: 15). Repair is a prime example of interactional conduct that may impede progressivity. In L2 interaction, repair sequences may be not only particularly frequent, but also particularly lengthy (e.g., Brouwer, 2003; Fasel Lauzon & Pekarek Doehler, 2013; Kurhila, 2006). This has specifically been shown to be the case for word searches (Koshik & Seo, 2012). Participants in L2 interactions in particular (but not exclusively) are hence acutely faced with a practical problem of social coordination: how to display an orientation toward progressivity while halting the moving forward of the interaction due to the search for a word (see Dressel, 2020, for L1 interactions). When the principle of progressivity is violated, participants look for explanations for such violations. Therefore, speakers may not only be held accountable for breaks in progressivity but may themselves (preemptively) engage in conduct that accounts for such breaks (Pekarek Doehler & Balaman, 2021).

In this paper, we investigate the multimodal resources L2 speakers deploy along the temporal unfolding of word searches to account for halts in progressivity and to show that their search is “self-directed” (Goodwin & Goodwin, 1986) or “solitary” (Dressel, 2020). In such “solitary searches”, speakers hence publicly display that their individual cognitive search is underway, and they work to preempt rather than to invite co-participants’ (conditional) entry into the turn-in-progress (Lerner, 1996). While research has documented the range of multimodal resources speakers use as they engage in word searches, most prominently focusing on the central role of gaze, we are here concerned with the temporal trajectories of these resources, and specifically gesture trajectories, in the context of solitary searches:

  1. Are there recurrent gestural trajectories that characterize the search process, displaying it as solitary?
  2. How do gestures interface with other resources as the search process unfolds in time?
  3. How do such assemblies of multimodal conduct map onto the various stages of the search, that is, onset, search-process, resolution?
  4. Ultimately, to what extent do the trajectories and the nature of their components change over time and L2 proficiency levels?

In what follows, we first briefly review research on the multimodal accomplishment of word searches (sect. 2) and present our data (sect. 3). We then offer a multimodal sequential analysis of solitary searches (sect. 4.1) and identify how gesture interfaces with other embodied and vocal conduct to form recurrent multimodal search trajectories involving temporal relations of both successivity and simultaneity (Mondada, 2018). Findings document a dynamic movement of gestures (in concert with other resources) across different stages of the search, involving hold (in the broader sense of “momentary suspension of movement”, Graziano & Gullberg, 2018: 5), then small hand movements or self-touch, then again hold, and finally notable re-engagement of gesturing. In the second part of the analysis (sect. 4.2), we examine change over time and L2 proficiency levels in search-conduct, specifically evidencing how the nature of gestures, and concomitantly the nature of word-search practices, changes over time. We conclude by relating our findings to prior research on word searches and L2 gesture use (sect. 5). Overall, the study shows how L2 talk-in-interaction features both generic characteristics of word searches and traits that are representative of particular stages in the L2 developmental trajectory.

2. Background

Word searches have attracted considerable interest in research on both L1 and L2 interactions (on L1, e.g., Dressel, 2020; Goodwin, 1983; Goodwin & Goodwin, 1986; Lerner, 1996; Schegloff, 1979; 2007; on L2, e.g., Brouwer, 2003; Hayashi, 2003; Kurhila, 2006; Koshik & Seo, 2012; see also Uskokovic & Taleghani-Nikazm, 2022/this issue). This body of work has shown that the search itself is accomplished as a three-step process, consisting of the search onset, the search-in-progress, and the search resolution (or abandonment). A word search hence unfolds over time. It is well-established that speakers deploy not only vocal means to show that a search is in progress (e.g., uhm or uh, Schegloff, 1979), but also use their gaze and other embodied resources to display engagement in a word search and to either invite co-participants’ help or to hold the floor in an attempt to solve the search by themselves. Importantly, these various means are distributed in locally relevant ways across the stages of the word search.

The onset of the search has been shown to be typically marked by syllable lengthenings, hesitation markers such as uh, and silences (for L1: Dressel, 2020; Goodwin & Goodwin, 1986; Schegloff et al., 1977; for L2: e.g., Koshik & Seo, 2012; Kurhila, 2006). The search process may perpetuate hesitation phenomena, repetitions, and further silent pauses. Goodwin and Goodwin (1986) have identified speakers’ middle-distance look or “thinking face” (see already M. Goodwin, 1983: 130) as displays of ongoing search, while speakers’ gaze on the recipient may work to solicit help (see also Dressel, 2020; Hayashi, 2003; Koshik & Seo, 2012), just as open calls for help do (M. Goodwin, 1983). Speakers may further index search by means of pragmatic or depictive gestures. Pragmatic gestures are gestures that “are about the process of communication” (Streeck, 2009b: 179; see also Kendon, 2004; Streeck, 2006; 2009a); an open-hand palm-up gesture that is held “frozen” at turn-end is an example of a pragmatic gesture used to invite, or even pursue, recipient response (Streeck, 2009a: 175). Speakers may also use pragmatic gestures to “display the search itself” (Streeck, 2009a: 173), but there is little information in the literature about the precise nature of such gestures. Depictive gestures (Streeck, 2009b) or iconic gestures (Kendon, 2004), in turn, depict, represent, or otherwise visually stage some referential content (cf. Lilja & Piirainen-Marsh, 2019). In word searches, speakers may use these to facilitate recipients’ understanding of the sought-for word so as to solicit candidate solutions (Dressel, 2020; Hayashi, 2003; Streeck, 2009a). The search resolution may consist of the speaker’s own production of the searched-for item, possibly delivered with try-marked intonation and often with gaze toward the co-participant (Koshik & Seo, 2012). Alternatively, speakers may produce depictive gestures to embodiedly complete their word search (Hayashi, 2003; Rydell, 2019), or the recipient may offer a candidate, typically with try-marked intonation (Sacks & Schegloff, 1979; Lerner, 1996) calling for ratification.

Regarding L2 interactions, word searches have been discussed as opportunities for learning, as scenes where expertise is enacted and negotiated (Brouwer, 2003; Koshik & Seo, 2012), and as practices that are reflexively related to social relations between participants (Pekarek Doehler & Berger, 2019). Most research has focused on how L2 speakers solicit help from co-participants in the course of the search or provide their own solutions as a “candidate” inviting co-participants’ confirmation (Hosoda, 2006; Koshik & Seo, 2012). Longitudinal work in the field is rare (however, see Hellermann, 2009, on self-initiated repair). In a longitudinal case-study on L2 French, Pekarek Doehler and Berger (2019) show that, over time, the target speaker’s word searches become less disruptive with regard to the progressivity of talk, and that the multi-word expression comment on dit (“how do you say”) progressively routinizes as a floor-holding device indexing cognitive search.

As to gestures in L2 interaction more generally, the literature suggests that L2 speakers gesture more than L1 speakers during disfluent speech (Gullberg, 2011), that depictive gestures often occur in contexts of lexical trouble (Gullberg, 2011), and that beat gestures that are rhythmically coordinated with talk may be used for self-regulatory purposes in L2 interactions (McCafferty, 2006; Steinbach Kohler & Thorne, 2011). Eskildsen and Wagner (2015; 2018) provide evidence for decreased gesture use over time in connection to an ESL speaker’s acquisition of particular linguistic items (under; across) and constructions (SVO patterns involving the verbs tell, ask, say). Graziano and Gullberg (2018), analyzing story-retellings, show that L2 speakers — just like L1 speakers — often drop their hands or hold the gestures when they stop talking: “when speech stops, so does gesture” (p. 13). However, the authors also suggest that “previous studies provide inconsistent evidence on the precise temporal relationship between gestures and (dis-)fluency” (p. 3, our emphasis). As a matter of fact, despite extensive literature on gestures generally, we still know little about the in situ temporal trajectories of gestures in word searches, about the precise nature of these gestures (are they tapping gestures, self-touch, etc.?), and about longitudinal change over time in L2 speakers’ gesture conduct in such searches. These are the issues we seek to address in the present study.

3. Data and method

The study focuses on Aurelia and Malia, two L2 speakers of French who for 15 months (three semesters) participated in a “conversation circle” for L2 speakers at a university in the French-speaking part of Switzerland. Both were upper-elementary level (A2) speakers at the beginning of the recordings and reached upper-intermediate (B2) level at the end (levels estimated based on course certificates, test scores, self-assessment). Both were living in the French-speaking region and working as PhD assistants at the university throughout the recording period.

The conversation circle was organized in collaboration with a French-language institute that held L2 support courses for university students and collaborators. The interactions took the form of coffee break conversations that offered the participants opportunities to practice L2 French in an informal setting. The conversation circle took place every two weeks during the academic year and typically lasted 30 to 60 minutes. Participants were interacting in groups of three or four speakers with similar proficiency levels; Aurelia and Malia were part of two different groups. No instructor or L1 speaker took part in the conversations. This had consequences for the management of repair, as interactional problems had to be solved among L2 speakers. The participants sometimes used English (a language they all spoke or understood to some extent despite their various L1s) as a lingua franca in repair sequences; occasionally, they also relied on expressions borrowed from Spanish or Italian. Over the 15 months, Aurelia and Malia participated in 18, respectively 23, conversation-circle meetings; all of these were video recorded from two different angles.

We used multimodal, longitudinal conversation analysis (Deppermann & Pekarek Doehler, 2021; Wagner et al., 2018) to uncover systematic patterns in Aurelia’s and Malia’s word-search practices over time. Since we were particularly interested in the participants’ practices for displaying individual search and holding the floor, we focused on the management of longer solitary searches; that is, searches in which the speaker does not find the solution after only a short hesitation, such as a syllable lengthening or a micro-pause, but notably, sometimes arduously, works her way through the search process toward search resolution. Although our dataset does not allow for quantification of different types of word searches, as it does not contain all instances of word searches in the recordings, a clear picture emerged in our data: Longer solitary searches occur throughout the recording period with both Aurelia and Malia, but are especially frequent midway through the recordings, when the speakers have reached approximately lower-intermediate level (B1) of proficiency (sect. 4.1). At this point in the learning trajectory, the speakers typically attempt to find the search-solution themselves. In contrast, in the earlier recordings, at upper-elementary/A2 level, they most often quickly abandon the search or call for recipient’s help (sect. 4.2.1). Finally, in the later recordings, at upper-intermediate/B2 level, they typically resolve searches expediently by themselves, without heavily impeding the progressivity of talk (sect. 4.2.2).

4. Analysis

In this section, we first examine longer solitary word searches (4.1). We show that participants coordinate precise embodied, paralinguistic, and linguistic means to enter into a search, to display ongoing search and thereby hold the floor, and to complete the search. The way speakers assemble these resources hence materializes as “multimodal trajectories” through which they forge their way across the word search to first account for breaks in progressivity and preempt co-participants’ turn-entry (Lerner, 1996) and then to display the completion as a resolution of the search process. We argue that these represent multimodal Gestalts (Mondada, 2014) in which recurrent simultaneous-successive characteristics of multimodal conduct combine in each single case with ad hoc contingently occasioned features. In a second step, we put our findings into a longitudinal perspective (4.2), relating them to the typical multimodal configuration of searches found at the start and at the end of the recordings, when Aurelia and Malia were less (upper-elementary level, A2) and more (upper-intermediate level, B2) advanced L2 speakers, respectively.

4.1 A multimodal trajectory of solitary searches

We first document a recurrent gesture dynamics of solitary word searches in our data (4.1.1), involving pragmatic gestures that suggest cognitive search, as well as gesture holds, which work in concert with gaze and hesitation phenomena to hold the floor while the speaker attempts to solve the search herself. As a point of comparison, we then demonstrate the different interactional consequentialities entailed by the use of depictive gestures in solitary word searches, showing that such gestures are not an effective floor-holding device even without response-inviting gaze at recipients (4.1.2).

4.1.1 Recurrent gesture dynamics characterizing solitary word searches

Before the start of Excerpt 1, Aurelia has talked negatively about the city of Zürich to her co-participant Rameh, who sits right across from her on the other side of the table (see Fig. 4). After the closing of this sequence (lines 1-4), Aurelia resumes more positive talk about the city, starting to assert that there is beaucoup de (“a lot of”, line 5)…, but then stalls.

Excerpt 1. Shopping

Aurelia enters the word search (line 5) through a syllable lengthening on de:, followed by vocalizations (e:hm: and e::hm) and several pauses (line 6), before completing the search with the paraphrase pour faire shopping (“to do shopping”, line 7). Both syntactic (beaucoup de) and vocal means project more to come and index the speaker’s ongoing search that extends over more than 3 seconds.

These vocal features work in close concert with Aurelia’s bodily conduct. Gazing to her left, averted from her co-participant from just before the start of the new sequence (line 4), she moves her gaze around in empty space in the course of the search, turning it to the right at the first hesitation marker (Fig. 1), left during the second one (Fig. 2), right during the next pause (Fig. 3), and again briefly to the left (line 7). By distinctly not gazing at Rameh, she displays that she is not inviting him to enter the search (cf. Goodwin & Goodwin, 1986). But here gaze is not just still. Rather, the right-left-right-left wavering may be indexically translating “looking for a solution”. It is only when completing her search with pour faire shopping, ending on try-marked (Sacks & Schegloff, 1979) intonation, that she turns her gaze straight to Rameh (line 7, Fig. 4). At this point, the prosodic delivery of the candidate combined with gaze direction concur to invite the recipient’s display of understanding, which Rameh offers immediately (line 8).

The observed vocal and gaze trajectory of the search is articulated by particular hand movements and holds — that is, “momentary suspension of movement” (Graziano & Gullberg, 2018: 5). The gesture dynamics goes from freeze at the very onset of the search and beyond, through short re-engagement of movement during the search process, to hold before the resolution, and then again re-engagement of movement coinciding with the resolution. At the beginning of her turn in line 5, Aurelia lowers her juice bottle to the table. Exactly as she enters into the vocally displayed search, she starts holding her right hand (RH) still on the bottle (Fig. 1). At the second hesitation sound (line 6), she very briefly slides her hand horizontally over the table (Fig. 2) and puts it again immediately in a hold in that position during the extended pause and until her production of the paraphrase (Fig. 3). It is in fact only once she starts to produce the search solution (line 7) that Aurelia suspends her hold: She lifts her hand from the table and waves it in a wobbly movement as she also wobbles her torso to the sides (Fig. 4), possibly to reenact the action of shopping. Note that the verbal delivery of the resolution starts slightly before resumption of co-speech gesture (see Dressel, 2020), which begins just before the speaker’s turning her gaze to the recipient. The very absence of depictive gestures during the search process may be a further indicator of Aurelia not inviting recipient’s entry into the search, while her depictive gesturing at the search resolution (Aurelia’s reenactment) combined with gaze at recipient calls for a response (which Rameh subsequently provides).

Rameh indeed shows his understanding of Aurelia’s engagement in a solitary search by refraining from taking the turn, thereby giving Aurelia time to complete the search herself, and he responds only once Aurelia has produced the candidate solution (co-participation in the search might also be difficult for Rameh, considering the many possible completions to the turn initiation).

The excerpt shows the use of vocal and bodily means to index cognitive search and hold the floor during word searches. While the nature of these means confirms earlier findings (sect. 2 above), the excerpt additionally illustrates how these means are assembled (cf. Goodwin, 2013) along the temporal trajectory of the search in ways that are recurrent in our data. Notable halting of hand movements starts with the vocally displayed initiation of the search and is accompanied by gaze averted from the recipient; gesture is then held still with continued gaze aversion (here: moving left and right through space), hesitation markers, and syllable lengthenings to index the cognitive search process, and resumption of gestures and turning of gaze to co-participant coincide with the delivery of the search solution. So, we observe a complex on-line orchestration — putting into play both simultaneity and sequentiality — of multiple means deployed by speakers over the course of the search-initiation, the search-process and up to its resolution.

Excerpt 2 provides a further illustration. Malia has been telling Zarah (sitting next to her) and Theo (seated on the other side of the table, Fig. 3) about the arduous (and expensive, line 2) process of getting her foreign university certificates notarized. In line 4, she extends her turn with et trè:s (‘and very’), projecting another negative aspect of the process. This is when she enters into a search for an appropriate qualifier. In addition to the resources observed in Excerpt 1, Malia here uses the metalinguistic expression comment dit (a morphophonologically reduced variant of “how do you say”; henceforth CoDi) as a marker of cognitive search:

Excerpt 2. Procès

Again, vocal and embodied means work in concert to mark the search initiation: Malia averts her gaze from Zarah down toward the table and stops her circling gestures (see Fig. 1) to join her hands together in front of her in a hold (Fig. 2) as she produces two sound objects: m- mh: (line 4). On the second of these, Malia raises her gaze slightly into empty space. She maintains this middle-distance look as she produces comment dit (line 5) and scratches her chin (Fig. 3). Repeating the adverbial très (“very”, line 6), Malia then offers a non-lexical vocalization (pfhhh) as she joins her hands into another brief hold and gazes around quickly before looking up toward the ceiling (Fig. 4). With the resolution, Malia emphatically opens her hands, fingers spread, and gazes at Zarah (Fig. 5) when completing the search with difficile (“difficult”, line 6). In line 8, Malia resumes fluent speech and deploys pragmatic gestures that rhythmically punctuate her high-grade assessment “very very expensive”.

In sum, the excerpt shows a vocal-bodily word-search trajectory similar to that in Excerpt 1, with distinct dynamics of hand movements: gesture movement before search onset, a brief freeze right after onset, short re-engagement of movement (self-touch coinciding with CoDi) during the search, then again cessation of that movement before the resolution, and finally prominent re-engagement of movement toward recipient at the resolution. This is distinctly coupled with gaze: gaze aversion with the halt of gestures at search onset, maintenance through the process, and then return of gaze toward recipient with resumption of gesturing at the resolution. Note also the parallelism in the nature of the gesturing: Both the waving (Ex. 1) and the emphatic opening of hands and fingers (Ex. 2) are directed toward the recipient and can be seen as indexing the speaker’s “having found something”, or maybe even “offering” a solution (see Streeck, 2009a). Additionally, in Excerpt 2, the cognitive search marker CoDi works as a “filled pause”, holding the floor, but occurs only well into the search process, as an additional means of accounting for the substantial break in the turn’s moving forward by displaying the break as being related to a cognitive search.

Excerpt 3 shows how speakers may pursue their solitary search even if a candidate completion has been offered, seeking to produce a particular linguistic item in the L2 rather than merely warranting mutual understanding. Malia, Zarah (next to Malia), and Theo (opposite Malia) have been discussing the administrative difficulties related to Zarah’s university admission. In lines 1-2, Malia asserts that she thinks that Zarah peux trouver (“can find”), but she has difficulties producing the object of the clause. Despite Malia’s deployment of various non-linguistic means to make recognizable that she is engaged in a word search, Zarah offers the (vague) candidate quelque chose à f- (“something to d-”, line 4). Although Malia first repeats Zarah’s candidate (line 5), she maintains her gaze averted from the co-participants and pursues the search until she finally offers the searched-for word in English.

Excerpt 3. Way

As Malia produces the word trouve:r (“find”, line 2), she gazes and gestures toward her interlocutor (Fig. 1). The entry into non-fluent speech is marked with silence and a hesitation sound and slightly diminished amplitude of gesture, as Malia first bunches her fingers together into a “grappolo” gesture (Kendon, 2004), as if attempting to seize a word, then opens them again, shaking them slightly palms up, and gazes down toward the table (line 2). The search process is displayed through further non-linguistic sound objects interspersed with silence (line 3). Malia, while still gazing down, lowers her right hand and flips it onto the table, palm down (Fig. 2). She then taps her fingers on the table, turns her hand to the side and uses two sets of gestures that seem to index a solitary search (see again Streeck, 2009b: 179 on pragmatic gestures that “are about the process of communication”): Malia rubs the tips of her thumb and index finger in small, repetitive “search” gestures (Streeck, 2009a) (Fig. 3), and then places her left hand against her left temple (Fig. 3), as if suggesting “thinking”, and maintains it there until she resumes speech (line 5). Despite Malia’s multimodal displays of a solitary search, Zarah offers an unsolicited candidate completion (line 4). Malia first acknowledges Zarah’s candidate (line 5), but maintains her gaze lowered and rapidly stops gesturing by letting her right hand fall to the table and then into her lap and by turning and keeping her left hand in a hold (Fig. 5) as she produces another hesitation sound and pauses (line 6). She thus both gesturally and vocally pursues the search for another word, but eventually completes the search by offering the English-language way (line 6) and pointing forward with her palm up (Fig. 6). Malia then lowers her gaze and places her head in her hands (Fig. 7), displaying frustration with her inability to find the French word, and she eventually looks up toward Theo (Fig. 8), who takes the turn (line 12 and onward).

This excerpt demonstrates again the dynamic trajectory of the various vocal and embodied means speakers use to display their proceeding through the word-search process. Similarly to the preceding excerpts, we have a reduction (though not a complete cessation) of hand movements at the vocally marked search onset combined with gaze aversion from the recipient, which is here, however, followed by a range of pragmatic gestures indexing impatient search (flipping hand, tapping and rubbing fingers), and we observe a short hold preceding re-engagement of gesture concurrent with the search resolution. The shift from one vocal resource to another (or to silence) within the search process is again finely coordinated with a change in embodied conduct, such as a shift from moving gestures to gesture hold. Speakers hence segment the search process itself both vocally and visually, and the nature of their gestures in that process is distinctly pragmatic rather than depictive.

4.1.2 Pragmatic vs. depictive gestures during word searches: an observation

Excerpt 4 confirms, by way of contrast, the relevance of precise gestural conduct for successfully claiming the right to pursue the search on one’s own. Here we see that depictive gestures (i.e., gestures that convey referential content; sect. 2 above) may promote other-completions of word searches even without gaze conduct inviting recipients’ participation, and hence depictive gestures are not an effective floor-holding resource, in contrast to the pragmatic gestures discussed above. Aurelia tells Mia (next to her) and Natascha (opposite her, Fig. 2) about a friend who travels to Spain each time he needs to go to the dentist because it is less expensive than in Switzerland.

Excerpt 4. Braces

Right after the start of the search, Aurelia raises her left hand to her mouth and gazes down toward the table (line 1). While this might at first sight be seen as a self-touching gesture suggesting cognitive search (as in Ex. 2, 3), Aurelia then makes small horizontal movements with her fingers across her teeth as she delivers CoDi (line 2) while maintaining a lowered gaze (Fig. 1). She subsequently freezes her hand during the 0.5-second pause, and then taps her teeth slightly at the hesitation sound e::h (line 2), with her gaze still lowered. She hence performs what Goodwin calls an “environmentally coupled gesture”, that is, a gesture that “is tied to the physical, semiotic, social and cultural properties of the environment within which it is embedded” (2003: 24). At this point, Natascha offers the candidate braces (in English, line 3), showing her understanding of Aurelia’s gesture as a depictive gesture conveying information about the sought-for word, which Aurelia confirms (line 4, Fig. 2). In this case, then, the semiotic quality of the self-touching gesture also approximates the target item, and thereby occasions the co-participant’s participation in the searching activity, and this is so despite the fact that Aurelia keeps her gaze averted from the recipient; that is, she does not deploy gaze to invite recipient help.

4.1.3 Summary — I

Our analyses so far have shown how speakers orient to the preference for self-repair (Schegloff et al., 1977) by mobilizing a range of resources to display their word searches as solitary searches, thereby accounting for breaks in progressivity as being search-related, and at the same time preempting recipients’ entry into the turn (Lerner, 1996) so as to solve the search on their own. While the analyses confirm earlier findings, particularly regarding gaze conduct, they also provide new insights into the nature and the dynamic deployment of gestures along the temporal unfolding of solitary searches:

  1. At search onset, vocal hesitation phenomena and speaker’s gaze averted from recipient (cf. Goodwin & Goodwin 1986, inter alia) coincide with suspension (or notable decrease in amplitude) of hand movements.
  2. During the subsequent search process, the speaker’s further vocal and verbal search signs (clicks, pffs, CoDis, syllable lengthenings, (filled) pauses), plus gaze still averted from recipient (cf. Dressel, 2020; Goodwin & Goodwin, 1986, for L1; Hayashi, 2003; Koshik & Seo, 2012, for L2), co-occur with pragmatic gestures conveying cognitive search (cf. Streeck, 2009a). Among the most recurrent search-indexing gestures, we found scratching one’s chin, touching one’s temple, tapping or rubbing fingers.
  3. At search resolution, the speaker delivers a (candidate) solution with gaze returning to recipient (cf. Dressel, 2020; Koshik & Seo, 2012), and this is coupled with a notable resumption of pragmatic co-speech gestures, which are, however, of a different quality than those during the search process: gestures conveying “having found” or “offering” a solution, such as slamming an open hand on the table, spreading fingers with both hands up, or forward-gesturing (see Streeck, 2009a). As seen in Ex. 3, such forward-gesturing may occur even in cases in which speakers orient to the search solution as “less than perfect”.

The above describes a multimodal trajectory that characterizes extended solitary word searches, materializing in simultaneously and successively deployed vocal and bodily-visual resources that demarcate different stages of the search. This dynamic unfolding can schematically be represented as follows:

Figure 1. Schematic representation of the multimodal trajectory of solitary word searches

Figure 1 represents a Gestalt-like trajectory, that is, a methodical arrangement of several resources distributed in time (Mondada, 2014), with fixed and open slots: While the cited trajectory, consisting of simultaneously (vertical: co-occurring) and successively (horizontal: occurring one after the other) deployed multiple resources, shows remarkable consistency in the data, it may on occasion be completed with locally specific features (such as Aurelia’s gaze wandering around, Ex. 1). The very consistency of the observed pattern suggests that it is part of methodic practices through which participants accomplish solitary word searches in their L2.

Noteworthy is the differential deployment of various types of gestures in the course of the search. During the search process, we observed two types of pragmatic gestures: small repetitive “search” gestures (Streeck, 2009a) (Ex. 3) suggesting that there is a problem with speech production (see also Graziano & Gullberg, 2018, on gesturing during disfluency more generally), and self-touching (Ex. 2, 3), more precisely face-touching, of the type that conveys “thinking” (e.g., touching one’s temple) and works to hold the floor in ways similar to the “thinking face” documented by Goodwin and Goodwin (1986) (see GIFs 1-4).

Gif. 1

Gif. 2

Gif. 3

Gif. 4

In contrast, the search resolution is accompanied by other types of pragmatic gestures, namely gestures directed toward the recipient (Streeck, 2009a) that convey speakers’ having found or offering a solution (see GIFs 5-6).

Gif. 5

Gif. 6

All excerpts also included cessation or distinctly decreased amplitude of dynamic hand/arm movement, most consistently at the onset of the search and right before its resolution. Whereas gesture holds have been documented to occur with word searches in general (Dressel, 2020) and specifically with halts in speech production (Graziano & Gullberg, 2018), we find them not only at search onset, but also consistently right before the resolution. These latter holds mark the transition between the search process itself, as displayed by repetitive “searching” gestures, and its resolution, as displayed by “finding” or “offering” gestures. Furthermore, this gestural conduct differs strongly from the gestures used by speakers to solicit help to complete a word search (see below), which typically are depictive in nature, visually representing the searched-for item, or pragmatic gestures directed to the recipient during the search process (such as palm-up/open-hand gestures; e.g., Dressel, 2020). Excerpt 4 allowed us to illustrate the “risk” associated with the use of depictive gestures in what is otherwise designed as a solitary search, as these may favor co-participant turn-entry even if the speaker’s gaze remains averted. Taken together, the above excerpts also demonstrate that the distinction between depictive and pragmatic gestures is relevant to participants: Speakers and recipients orient to these gesture types as doing different interactional jobs in precise environments within the course of word searches.

In what follows, we turn to the question of longitudinal change: Can we observe any change over time and proficiency levels in the participants’ multimodal management of solitary word searches?

4.2 Change over time and proficiency levels in the multimodal trajectories of word searches

As mentioned above, solitary word searches occur throughout our data. They nevertheless vary across the recording period, and hence across speakers’ proficiency levels, in terms of their relative frequency in comparison to collaborative word searches (in which the speaker solicits help from co-participants), in their length, and in the way they are resolved. This, in turn, is reflected in aspects of their temporal multimodal unfolding.

At the start of the recording period, when our participants are at upper-elementary (A2) level, extended solitary searches are rare; instead, searches are typically rather short, ending either in the speaker’s solicitation of help from co-participants (e.g., through explicit verbal requests or gestures that facilitate co-participant turn-entry), or else in self-completion through code-switching to another language or through gestures that depict the searched-for item (for the latter, see Hayashi, 2003; Rydell, 2019). These properties are reflected in the multimodal trajectories of the searches (sect. 4.2.1). In contrast, at the more advanced level reached by our participants toward the end of the recording period (upper-intermediate/B2, and above), word searches are frequently successfully self-completed in the same turn, and this often happens expeditiously. The shorter searches do not require the same elaborate accounting for breaks in progressivity and floor-holding as the extended searches documented above, and this materializes in a notable diminution of vocal and embodied conduct displaying search in the course of the search process (sect. 4.2.2). Quantitative analysis of the precise frequency of particular gestures is beyond the scope and practical feasibility of this study; instead, our observations are based on systematic qualitative analysis of the data.

4.2.1 Upper-elementary level: Rapidly abandoned solitary searches

Excerpts 5 and 6 illustrate the type of word-search practices our participants regularly engage in at the beginning of the recordings, when they show overall upper-elementary (A2) L2 proficiency. Solitary search conduct is most often rapidly abandoned in favor of search solutions offered in a different language or gesturally.

Excerpt 5 contains two word searches: the first one is completed through an approximation based on another language (here Spanish, line 3) together with depictive gesturing and the second one through depictive gestures alone (lines 7-10). Aurelia suggests to Natascha (next to her), Mia, and Suresh (both on the opposite side of the table, see Fig. 6) that university teaching in Switzerland is not focused (line 3) on students but rather reflects a pyramid-like structure (line 10, Fig. 7-8) with the professors at the top (not shown here).

Excerpt 5. Enseignement

Aurelia’s embodied enactment of the search starts off by displaying it as a solitary search but changes course mid-way. At the onset and as part of the process (line 2), vocal signs of search-initiation are accompanied by gaze averted from recipient and a notable hold of hand movements (line 2, Fig. 1-2); within the search process, searching is indexed through gaze conduct (closing of the eyes) and non-depictive gesturing (a downward beat gesture, Fig. 3) that may also serve a “self-regulatory” function (see Steinbach Kohler & Thorne, 2011: 75, and McCafferty, 2006, on rhythmic beat gestures used to self-regulate talk). However, further into the search, its multimodal enactment starts to shape up in a way that strongly differs from the solitary searches observed above: Aurelia produces the candidate enfocué (line 3), gluing a French ending -ué onto the root of Spanish enfocar (“to focus”), coupled with what can be seen as a depictive gesture, as she brings her hands closer together and lowers them in two beats as if enacting “focus” (Fig. 4-5). After Natascha’s display of understanding (lines 5-6), Aurelia completes her TCU with dans l’étudiant (“in the student”, as in “focused on students”, line 7), but she immediately continues with the contrastive formulation c’est plu:s (“it’s more”, lines 7-8) and initiates another search. This time she deploys depictive gestures combined with gaze on recipient that may be designed to recruit recipient’s help (which is however not provided): She offers an embodied completion of “it’s more” by positioning her hands at distinctly different heights in front of her (Fig. 6). Natascha again confirms her understanding (line 9), after which Aurelia continues her embodied demonstration by joining her fingertips high in the air and then lowering her hands in a stepwise manner in the shape of a pyramid (Fig. 7-8) ending on comme ça (“like that”, line 10), while gazing to Suresh and Mia as if soliciting their response. This then is met with co-participants’ confirmations of understanding (lines 11-13).

In this excerpt the multimodal search trajectory is hence initially similar to the solitary searches documented above, but changes course rather rapidly as the speaker concludes the search through alternative means, using depictive gestures and Spanish, rather than seeking to find the target item in her L2.

Excerpt 6 shows a case of very early production of depictive gestures, coinciding with the very onset of the search, which in principle immediately provide opportunities for the recipient to participate in the search (cf. Dressel, 2020: 46; Hayashi, 2003: 117-8). Malia has just assessed her first day of work at the university as “very horrible” and she launches a telling about what happened. She initiates a search for the expression “knock on the door”, starts gesturally enacting the door knocking, and then ends up switching to English (Zarah sits next to Malia; Mariana and Theo opposite her).

Excerpt 6. Knock the door

From its very onset and throughout her search, Malia deploys depictive gestures: She lifts her right hand and twists it in a way that illustrates her turning a door handle (as we later understand based on the subsequent talk, Fig. 1-2), then flips her right hand over and knocks three times on the table during an extended pause (line 2, Fig. 3), possibly depicting the knocking on a door. After 1.1 seconds, she gazes up at her co-participants and offers the English expression knock the door in low volume and with slightly rising intonation, with both gaze and intonation concurring to invite a response. She then redoes the knocking, this time in the air (and hence within the range of the spatial coordinates of a door rather than a table) with her gaze still directed toward her co-participants (line 3, Fig. 4), upon which Mariana and Theo confirm understanding (lines 4-5). Thus, Malia’s vocal displays of a search for a French wording are from their very onset coupled with depictive gestures providing affordances for the co-participants to offer a candidate solution, but she rapidly switches to English and abandons the search for an L2 item (the fact that no French-language solution is offered by her co-participants may be related to the fact that they are also only elementary speakers).

As illustrated in Excerpts 5 and 6, the multimodal trajectories of the searches at elementary level are often initially similar to those of more extended solitary searches found mostly at intermediate level, but they tend to be interrupted early on as the speaker abandons the search for an L2 item and instead deploys alternative means for making that target understandable to recipients. The prevalence of depictive gestures is noteworthy, as these enhance understandability of non-target-like search solutions for co-participants, and thereby may either facilitate co-participant turn-entry into the search or complete the search non-verbally (there are several possible reasons why no L2 candidate is offered in Ex. 5-6, such as co-participants’ equally low L2 proficiency, or possible orientation to fostering the progressivity of talk; lack of space prevents us from discussing these in detail).

4.2.2 Upper-intermediate level: Rapidly resolved solitary searches

At the other end of the 15-month recording period, when Aurelia and Malia have reached upper-intermediate level (B2), configurations of searches emerge that are brief and successfully resolved in the L2 by the speaker herself, after only a short disruption in progressivity. This change is prominently reflected in the way gestures are used during these searches. In contrast to the solitary searches most typical of intermediate-level proficiency (sect. 4.1), gestures are minimal in nature, although the onset of the search and its resolution show assemblies of multimodal conduct that converge with more extended solitary searches. Contrary to the searches at elementary level (sect. 4.2.1), gestures are rarely depictive. Concurrent with the distributional change in repair practices is also the participants’ routinization of the CoDi expression — at this point, it is deployed as a highly habitualized means for indexing cognitive search and thereby, briefly, holding the floor (see also Pekarek Doehler & Berger, 2019; Pekarek Doehler & Skogmyr Marian, 2022/in press).

In Excerpt 7, Aurelia is talking to Jordan (see Fig. 2) about her frequent traveling between Switzerland and other countries, and she will eventually assert that she does so on purpose because she likes it.

Excerpt 7. Exprès

The excerpt shows a condensed form of the multimodal trajectory observed with the more extensive solitary searches (sect. 4.1). We see again (line 2) the conjunction of hold (starting with Aurelia’s 0.6s pause) and gaze aversion (at e:::h) around the search onset that is maintained into the search process, as well as a re-engagement of manual gestures plus gaze turning to the recipient with the resolution (here, the production of the target-item exprès, “on purpose”, which is received by Jordan with nods, line 3, simultaneously with Aurelia’s integrating it in her next utterance). Furthermore, the quality of Aurelia’s gesture accompanying the resolution — her resolutely placing her hand flat on the table (Fig. 2) — is in line with the type of pragmatic gestures suggesting search resolution documented in Section 4.1. The segment between the onset and resolution of the search, however, differs significantly from what we have observed above: It is notably short, marked here verbally merely by the comment tu dis produced with speed-up of tempo and lower volume, which are typical features of the marker-like use of the expression for displaying cognitive search (Pekarek Doehler & Berger, 2019; Pekarek Doehler & Skogmyr Marian, 2022/in press), and accompanied by maintained gaze aversion and gesture hold.

Excerpt 8 further illustrates such short searches, again showing a CoDi that is here accompanied by a search-indicating gesture. Malia is telling Javier (next to her) and Jordan (on the other side of the table) about a question-answer session that she had with her students the preceding week. As she initiates a specification of the purpose of the session, she enters a brief word search (line 3).

Excerpt 8. Séance

As in Excerpt 7, the search is verbally marked merely by Malia’s morphophonologically reduced CoDi expression, and it is accompanied by distinct gaze aversion (Fig. 2). Unlike in Excerpt 7, the search involves some hand movements, as Malia stops her circling gesture toward Javier (Fig. 1) and lifts her right hand, briefly placing it against her temple (Fig. 2), a gesture conveying “thinking” (see Ex. 3), while delivering the CoDi. The resolution (lines 3-4) is accompanied by gaze shift plus a palm-up offering gesture toward the recipient (Fig. 3), which can work to invite response (Streeck, 2009a, b). Javier responds by displaying understanding (line 6), after which Malia continues her talk.

In short, Excerpts 7 and 8 illustrate the type of searches that are most typically found toward the end of the recording period, when Aurelia and Malia have reached upper-intermediate L2 proficiency. Similar shorter searches also occur earlier in the 15-month period but are particularly frequent toward the end. Components of the multimodal trajectory remain constant at its onset (vocal signs of word search are coupled with gaze aversion and most frequently gesture hold) and at its resolution (return of gaze to recipient and re-engagement of gesture through pragmatic gesturing in direction of the recipient). In between, however, the search becomes typically condensed, is often very short, accompanied by discreet embodied signs of search, and we observe an increased use of CoDi expressions as routinized markers of cognitive search. Notably, also, the resolution of the search tends not to be prosodically marked as a candidate seeking confirmation.

4.2.3 Summary — II

The longitudinal change in word-search practices presented here involves a gradual transition toward increasingly smooth and successfully self-completed searches, which affects how speakers mobilize particular semiotic resources — such as gestures — in the course of their searches. The continuum of search practices that are typical of the different points in the recording period (and hence of three different proficiency levels) can be schematically summarized as follows:

  1. Start of the recording period (upper-elementary level, A2): Brief solitary searches, abandoned or completed through use of other languages than the L2 and/or depictive gestures that facilitate understanding of the sought-for item and/or invite co-participants’ turn-entry; searches are often collaboratively completed.
  2. Midway through the recording period (lower-intermediate level, B1): Extended solitary searches are frequent relative to other types of searches. These are accompanied by multimodal resources for floor-holding such as pragmatic gestures and self-touch conveying cognitive search, and gesture holds, but typically not depictive gestures.
  3. End of recording period (upper-intermediate level, B2): Brief, rapidly solved word searches are more prevalent than before, with speakers successfully self-completing searches with L2 resources. This also involves routinized use of CoDi indexing cognitive search, combined with gaze aversion, allowing the speaker to maintain verbal fluency during brief moments of thinking. Pragmatic gestures indexing cognitive search are sometimes used but typically take short, discreet forms.

Our data thus show a gradual redistribution in the use of highly recurrent assemblies of multimodal conduct in the middle (search process) and end (search resolution) stage of word searches. These assemblies are tied to the accomplishment of different practical purposes in the search (holding the floor vs. depicting the searched-for item and/or inviting co-participant turn-entry), as seen in co-participants’ responses to such assemblies.

The findings regarding elementary-level speakers converge with Pekarek Doehler and Berger’s (2019) longitudinal case study showing that their target L2 speaker initially tended to break up her searches in medias res or used her L1, while over time increasingly seeking to solve searches with L2 means, either by providing the target lexical item or using paraphrase. They also converge with Hellermann’s (2009) study documenting an ESL learner’s increased self-initiated self-repairs over five terms. The observed rapid abandonments of searches for L2 solutions at elementary level may be symptomatic of these speakers’ understanding that they do not “know” the target lexical item, and hence that it is futile to search for it in their L2 repertoire; the longer solitary searches at lower-intermediate level instead index speakers’ orientations to their increased ability and aspiration to solve linguistic problems on their own. Finally, at even more advanced levels, speakers’ generally higher proficiency and richer linguistic repertoires might lead to fewer longer searches overall, but might also “oblige” speakers to account for even short breaks in progressivity of their otherwise highly fluent talk, for instance through the use of a recognizable word-search expression such as CoDi that clearly marks the activity as a cognitive search.

5. Discussion

This paper set out to analyze the dynamic deployment of gesture as it interfaces with other vocal and bodily-visual conduct along the temporal unfolding of solitary word searches. It also aimed to shed light on longitudinal change in multimodal word-search practices with speakers’ increased L2 proficiency. Focusing on searches that are not immediately solved after just a short hesitation, we showed how gestures participate in recurrent Gestalt-like multimodal trajectories; that is, methodic practices (cf. Goodwin, 2000) through which participants display the search as a solitary search, thereby both accounting for breaks in progressivity and preempting recipient’s entry into the turn-in-progress (sect. 4.1). We also shed light on longitudinal change in such practices with speakers’ increased L2 proficiency (sect. 4.2).

While our findings concur with prior research as regards the types of embodied conduct in word searches that invite co-participant turn-entry (e.g., gaze on recipient, depictive gestures) or, on the contrary, serve the purpose of floor-holding (gaze aversion, self-regulatory gestures, etc.) (Dressel, 2020; Goodwin & Goodwin, 1986; Hayashi, 2003; Koshik & Seo, 2012; Rydell, 2019, among others), they also provide new insights as to the nature and the dynamic unfolding of gestures in such searches. We documented the segmentation of longer solitary word searches into separate steps, in which gesture hold (or clearly diminished amplitude) marks the search onset (conjointly with gaze aversion), moderate re-engagement of gesture is found in the course of the search process, further hold marks transition from search process to resolution, and notable resumption of co-speech gesture (together with gaze on recipient) accompanies the search resolution. Two main types of pragmatic gestures were found within the search process, namely repetitive “searching” gestures and self-touching gestures (typically face-touching), both indexing cognitive search and thereby holding the floor along similar lines as the middle-distance look and “thinking face” described by Goodwin and Goodwin (1986). These differ from the pragmatic gestures directed toward the recipient at the resolution of the search, which confer a sense of “having found” or “offering”.

Our findings demonstrate the interactional consequentialities of the concurrent layering of multi-semiotic resources in word searches and of the different temporal affordances of these resources. We have shown that speakers methodically arrange Gestalt-like configurations (cf. Mondada, 2014) of gaze, gesture, and vocal conduct that work in concert to display cognitive search and preempt co-participants’ entry into the search. This was notably confirmed by a case where averted gaze coupled with depictive gesturing did not have a preempting effect. Interactants thus pay careful attention to the precise semiotic nature not only of gestural conduct, but also of how it works together with other resources. Therefore, our claim is not that participants orient specifically to the described gesture trajectories, but that they attend to the conjoint on-line deployment of the various resources through which speakers make recognizable that they enter and pursue a search (during which recipients refrain from co-participating in that process) and that they reach a search resolution (to which recipients respond by displaying understanding). The simultaneous use of multiple semiotic resources is made possible by their different temporalities, whereby gestures and changes in gaze direction can be deployed at the same time and also together with vocal resources.

In terms of change over time and proficiency levels, our evidence regarding the longitudinal redistribution of word-search practices contributes to the still limited research on change in gesture use over the course of L2 development. It has been suggested that L2 speakers overall gesture more than L1 speakers (e.g., Gullberg, 2011), and that gestures accompanying the production of precise lexical items may decrease over time (Eskildsen & Wagner, 2015; 2018). Our data indicate that depictive gestures designed to facilitate understanding of the searched-for item lend themselves particularly to the kind of word searches that are prototypical of early stages of the developmental L2 trajectory, in that they may either recruit co-participant help to complete the search or effectively complete the speaker’s turn embodiedly. The fact that longer solitary searches occur frequently around intermediate level and then decrease suggests that pragmatic gesturing during word searches also might increase at certain points in the developmental trajectory and then decrease again at high proficiency levels when speakers engage in briefer word searches. Importantly, it is neither the function nor interactional purpose of gestures that changes: Depictive gestures are deployed for related purposes throughout the developmental trajectory and so are pragmatic ones. The longitudinal change is rather tied to a redistribution of repair practices that involve specific recurrent multimodal packaging — of which gestures are (one) part. More systematic analysis of different types of gestures used in word searches at different proficiency levels is needed to confirm these observations.

Prior research on L2 word searches has mostly focused on interactions between L2 and L1 speakers (e.g., Brouwer, 2003; Hosoda, 2006; Koshik & Seo, 2012; Pekarek Doehler & Berger, 2019). In our data, no participant has an a priori status as language expert. The documented word-search practices reflect this framework, in which the participants have to work out their interactional troubles among themselves in order to get on with the conversation. On the one hand, participants constantly draw on each other’s relative (and cumulative) L2 expertise to reach intersubjectivity and most often they do so successfully, for instance through depictive gestures or code-switching at elementary level. On the other hand, interaction among L2 speakers might also favor speakers’ attempts to solve interactional challenges individually instead of immediately turning to an expert for help. The potential “benefits” of conversations between only L2 speakers for purposes of language learning are outside the scope of this study, but our analyses highlight both what elementary-level speakers can accomplish jointly when drawing on a repertoire of diverse resources, and that intermediate speakers typically manage to complete word searches alone if only they are given the opportunity to think for a moment. Of course, such opportunity has to be claimed and accounted for in recognizable and efficacious ways, and this is done precisely through the kind of multimodal Gestalts that we have documented in the study. Ultimately, we thereby hope to have provided further evidence for how “[t]he construction of action through talk within situated interaction is accomplished through the temporally unfolding juxtaposition of quite different kinds of semiotic resources” (Goodwin, 2000: 1490).

References


Brouwer, C. E. (2003). Word searches in NNS-NS interaction: Opportunities for language learning? The Modern Language Journal, 87(4), 534–545.

Deppermann, A., & Pekarek Doehler, S. (2021). Longitudinal Conversation Analysis. Research on Language and Social Interaction, 54(2).

Dressel, D. (2020). Multimodal word searches in collaborative storytelling: On the local mobilization and negotiation of coparticipation. Journal of Pragmatics, 170, 37–54.

Eskildsen, S. W., & Wagner, J. (2015). Embodied L2 construction learning. Language Learning, 65(2), 268–297.

Eskildsen, S. W., & Wagner, J. (2018). From trouble in the talk to new resources: The interplay of bodily and linguistic resources in the talk of a speaker of English as a second language. In S. Pekarek Doehler, E. González-Martínez & J. Wagner (Eds.), Documenting change across time: Longitudinal studies on the organization of social interaction (pp. 143–171). Palgrave Macmillan.

Fasel Lauzon, V., & Pekarek Doehler, S. (2013). Focus on form as a joint accomplishment: An attempt to bridge the gap between focus on form research and conversation analytic research on SLA. IRAL - International Review of Applied Linguistics in Language Teaching, 51(4), 323–351.

Goodwin, C. (2000). Action and embodiment within situated human interaction. Journal of Pragmatics, 32, 1489–1522.

Goodwin, C. (2003). The body in action. In J. Coupland & R. Gwyn (Eds.), Discourse, the body and identity (pp. 19–42). Palgrave Macmillan.

Goodwin, C. (2013). The co-operative, transformative organization of human action and knowledge. Journal of Pragmatics, 46, 8–23.

Goodwin, M. H. (1983). Searching for a word as an interactive activity. In J. N. Deely & M. D. Lenhart (Eds.), Semiotics (pp. 129–137). Plenum Press.

Goodwin, M. H., & Goodwin, C. (1986). Gesture and coparticipation in the activity of searching for a word. Semiotica, 62(1-2), 51–75.

Graziano, M., & Gullberg, M. (2018). When speech stops, gesture stops: Evidence from developmental and crosslinguistic comparisons. Frontiers in Psychology, 9, 879.

Gullberg, M. (2011). Multilingual multimodality: Communicative difficulties and their solution in second-language use. In J. Streeck, C. Goodwin & C. LeBaron (Eds.), Embodied interaction: Language and body in the material world (pp. 137–151). Cambridge University Press.

Hayashi, M. (2003). Language and the body as resources for collaborative action: A study of word searches in Japanese conversation. Research on Language and Social Interaction, 36(2), 109–141.

Hellermann, J. (2009). Looking for evidence of language learning in practices for repair: A case study of self-initiated self‐repair by an adult learner of English. Scandinavian Journal of Educational Research, 53(2), 113–132.

Hosoda, Y. (2006). Repair and relevance of differential language expertise in second language conversations. Applied Linguistics, 27(1), 25–50.

Kendon, A. (2004). Gesture: Visible action as utterance. Cambridge University Press.

Koshik, I., & Seo, M. S. (2012). Word (and other) search sequences initiated by language learners. Text & Talk, 32(2), 167–189.

Kurhila, S. (2006). Second language interaction. John Benjamins.

Lerner, G. H. (1996). On the “semi-permeable” character of grammatical units in conversation: Conditional entry into the turn space of another speaker. In E. Ochs, E. A. Schegloff & S. A. Thompson (Eds.), Interaction and grammar (pp. 238–276). Cambridge University Press.

Lilja, N., & Piirainen-Marsh, A. (2019). How hand gestures contribute to action ascription. Research on Language and Social Interaction, 52(4), 343–364.

McCafferty, S. G. (2006). Gesture and the materialization of second language prosody. International Review of Applied Linguistics, 44, 195–207.

Mondada, L. (2014). The local constitution of multimodal resources for social interaction. Journal of Pragmatics, 65, 137–156.

Mondada, L. (2018). Multiple temporalities of language and body in interaction: Challenges for transcribing multimodality. Research on Language and Social Interaction, 51(1), 85–106.

Pekarek Doehler, S., & Balaman, U. (2021). The routinization of grammar as a social action format: A longitudinal study of video-mediated interactions. Research on Language and Social Interaction, 54(2).

Pekarek Doehler, S., & Berger, E. (2019). On the reflexive relation between developing L2 interactional competence and evolving social relationships: A longitudinal study of word-searches in the ‘wild’. In J. Hellermann, S. W. Eskildsen, S. Pekarek Doehler & A. Piirainen-Marsh (Eds.), Conversation analytic research on learning-in-action: The complex ecology of L2 interaction in the wild (pp. 51–75). Springer.

Pekarek Doehler, S., & Skogmyr Marian, K. (2022/in press). On the progressive routinization of L2 grammar-for-interaction: A longitudinal study of comment on dit ‘how do you say’. The Modern Language Journal, 106, S1.

Rydell, M. (2019). Negotiating co-participation: Embodied word searching sequences in paired L2 speaking tests. Journal of Pragmatics, 149, 60–77.

Sacks, H., & Schegloff, E. A. (1979). Two preferences in the organization of reference to persons in conversation and their interaction. In G. Psathas (Ed.), Everyday language. Studies in ethnomethodology (pp. 15–21). Halsted Press.

Schegloff, E. A. (1979). The relevance of repair to syntax-for-conversation. In T. Givón (Ed.), Syntax and semantics, Vol. 12: Discourse and syntax (pp. 261–286). Academic Press.

Schegloff, E. A. (2007). Sequence organization: A primer in conversation analysis. Cambridge University Press.

Schegloff, E. A., Sacks, H., & Jefferson, G. (1977). The preference for self-correction in the organization of repair in conversation. Language, 53(2), 361–382.

Steinbach-Kohler, F., & Thorne, S. L. (2011). The social life of self-directed talk: A sequential phenomenon. In J. K. Hall, J. Hellermann & S. Pekarek Doehler (Eds.), L2 interactional competence and development (pp. 66–92). Multilingual Matters.

Streeck, J. (2006). Gestures: Pragmatic aspects. In K. Brown (Ed.), Encyclopedia of language and linguistics (2nd ed., pp. 71–76). Elsevier.

Streeck, J. (2009a). Forward-gesturing. Discourse Processes, 46(2–3), 161–179.

Streeck, J. (2009b). Gesturecraft: The manu-facture of meaning. John Benjamins.

Uskokovic, B., & Taleghani-Nikazm, C. (2022/this issue). Talk and embodied conduct in word searches in video-mediated interactions. Social Interaction. Video-Based Studies of Human Sociality, 5(1).

Wagner, J., & Gardner, R. (2004). Introduction. In R. Gardner & J. Wagner (Eds.), Second language conversations (pp. 1–17). Continuum.

Wagner, J., Pekarek Doehler, S., & González-Martínez, E. (2018). Longitudinal research on the organization of social interaction: Current developments and methodological challenges. In S. Pekarek Doehler, J. Wagner & E. González-Martínez (Eds.), Longitudinal studies on the organization of social interaction (pp. 3–35). Springer.