Too Big or Not Too Big: Establishing the Minimum Size for a Legal Ad Hoc Corpus
DOI: https://doi.org/10.7146/hjlcb.v27i53.20981
Abstract
A corpus can be described as “[a] collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis” (Francis 1982). However, the concept of representativeness remains surprisingly imprecise, considering its acceptance as the central characteristic that distinguishes a corpus from any other kind of collection (Seghiri 2008). In fact, there is no general agreement as to what the size of a corpus should ideally be. In practice, “the size of a corpus tends to reflect the ease or difficulty of acquiring the material” (Giouli/Piperidis 2002). This paper therefore addresses that key question. We examine the complex notion of representativeness and the ideal size of ad hoc corpora from both a theoretical and an applied perspective, and we describe a computer application, ReCor, which we use to verify whether a compiled sample of legal contracts can be considered representative from a quantitative point of view.
License
Authors who publish with this journal agree to the following terms:
a. Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
b. Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
c. Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).