Lexical Diversity and Sophistication in Professional Architectural Discourse: A Computational Comparison of Native and Non-native English Writing

Authors

  • Aliasghar Kargar, Shiraz University of Arts

DOI:

https://doi.org/10.7146/hjlcb.vi65.152533

Keywords:

Coh-Metrix, lexical diversity, lexical knowledge, lexical sophistication, second language writing

Abstract

This study explored lexical differences between texts written by native and non-native English-speaking professional architects and submitted to ArchDaily, the world’s most visited architecture website. It focused on indices of lexical diversity and sophistication, which have been shown to be theoretically and operationally relevant to discriminating between L1 and L2 writing. The corpus comprised randomly selected texts by Iranian and British architects in the category of residential architecture, the most common type of architecture project on the site. As stated on the website, the texts are authored and revised by the architects themselves, who are strongly advised to review and verify them thoroughly for accuracy and quality. The data were analyzed using the Coh-Metrix Core Desktop Beta (2023) package, and the results were then entered into SPSS for further analysis. Preliminary analysis revealed statistically significant differences between the two groups on most of the diversity and sophistication indices, and supplementary analysis indicated that lexical indices such as word frequency, familiarity, hypernymy, and diversity contributed significantly to distinguishing native speaker (NS) from non-native speaker (NNS) compositions. The study concluded that the writing norms of NS and NNS authors within distinct professional communities may not align with the conventional proficient versus non-proficient standards typically observed in language studies. The results highlight the significance of adhering to the target community's stylistic conventions, which carries important implications for instructional approaches in ESP and academic writing programs.
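
As a rough illustration of what a lexical diversity index measures, the following Python sketch computes a plain type-token ratio (TTR) and a moving-average variant. It is not the Coh-Metrix implementation used in the study: the regex tokenizer, the function names, and the 50-token default window are all assumptions made for this example.

```python
import re


def tokenize(text):
    """Lowercase the text and extract word tokens (a simple regex tokenizer, assumed here)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Number of unique word types divided by number of tokens; 0.0 for empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def moving_average_ttr(tokens, window=50):
    """Mean TTR over every run of `window` consecutive tokens.

    Falls back to the plain TTR when the text is shorter than the window.
    The window size is an illustrative choice, not a value from the study.
    """
    if len(tokens) < window:
        return type_token_ratio(tokens)
    ratios = [
        type_token_ratio(tokens[i:i + window])
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    sample = "The house faces the garden and the garden faces the house"
    tokens = tokenize(sample)
    print(type_token_ratio(tokens))
    print(moving_average_ttr(tokens, window=5))
```

Plain TTR tends to fall as texts get longer, which is one reason windowed or length-corrected measures are often preferred when comparing texts of unequal length.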

Published

2025-11-21

How to Cite

Kargar, A. (2025). Lexical Diversity and Sophistication in Professional Architectural Discourse: A Computational Comparison of Native and Non-native English Writing. HERMES - Journal of Language and Communication in Business, (65), 81–98. https://doi.org/10.7146/hjlcb.vi65.152533