Making sounds visible in speech-language therapy for aphasia


Sara Merlino, Università degli Studi Roma Tre



Keywords: speech-language therapy, aphasia, pronunciation, instructed vision, auditory and visual resources, multimodality, sensoriality, video-camera framings


In this paper, I analyse video recordings of speech-language therapy sessions for people diagnosed with aphasia. In particular, I explore how speech-language therapists instruct patients to pronounce speech sounds (e.g. phonemes, syllables) correctly by deploying not only audible but also visible cues. By using their bodies (face and gestures) as an instructional tool, the therapists make visual perceptual access to articulatory features of pronunciation relevant and salient. They can also make these sensory practices accountable through the use of other senses, such as touch. Data was collected in a hospital and in a rehabilitation clinic, tracking each patient’s recovery, and is part of a longitudinal multisite corpus. The paper considers how participants in the therapeutic process use and coordinate forms of sensory access to language that are based on hearing and seeing. It highlights the importance of audio and video recordings for making the auditory and visual details of these sensorial experiences accessible, in particular through proper framings and the complementary use of fixed and mobile cameras.



How to Cite

Merlino, S. (2021). Making sounds visible in speech-language therapy for aphasia. Social Interaction. Video-Based Studies of Human Sociality, 4(3).