Problem-oriented Corpus Annotation and the Hebrew Bible
DOI:
https://doi.org/10.7146/hn.v5i2.142730
Keywords:
corpus linguistics, quality assurance, corpus annotation, Hebrew
Abstract
In this contribution, I argue that the exegetical and stylistic study of the Hebrew Bible would benefit from the creation and storage of qualitative and quantitative annotations using problem-oriented corpus annotation (de Haan 1984). Within Biblical studies, exegetes are used to static interfaces which allow them to retrieve information, but not to enhance it with anything more elaborate than user notes. I present a roadmap for the development of an annotation tool tailored to the Hebrew Bible with the sole objective of enriching the data that is already present in open source datasets like that of the ETCBC. Based on my experience with a dataset to annotate conceptual metaphors in the book of Job (bibliametaphorica.com) and the literature on corpus annotation (Sinclair 2004; Leech 2005; Fort et al. 2012), I argue that data creation and enrichment is a challenging, yet rewarding endeavour. It is challenging because it is circular, viz. labels are informed by the data and are hence difficult to define a priori. Furthermore, it is difficult to be consistent, and the actual, manual labelling of the text requires interpretative choices that cause editorial fatigue. Fort et al. (2012) suggest that annotation campaigns have differing degrees of difficulty, which can be mitigated not just by inter-annotator rating, but by conscious decisions to lower the annotation complexity. A user interface is needed that limits annotation complexity and that allows researchers to annotate the text with minimal effort. The end result is, for instance, an XML document that contains both the text and the annotations, in a format that can be merged back into the original database, but need not be. The existence of a tool for the manual annotation of open data will increase the replicability of research as well as its democratisation, as students world-wide can create and share their data.
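To make the idea of a single XML document holding both text and annotations more concrete, the following is a minimal sketch in Python. It is not the format of the proposed tool or of the ETCBC dataset: the element names (`verse`, `word`, `annotation`), the attribute names, the two-item label set, and the placeholder tokens `w1` to `w4` are all assumptions made for illustration. The small closed label set is one possible way of lowering annotation complexity.

```python
import xml.etree.ElementTree as ET

# Hypothetical closed label set: restricting annotators to a small, fixed
# vocabulary is one way to keep annotation complexity low.
ALLOWED_LABELS = {"metaphor", "literal"}


def build_annotated_verse(ref, words, annotations):
    """Return an XML element holding the verse text and its annotations.

    ref         -- verse reference string (illustrative only)
    words       -- list of word strings in reading order
    annotations -- list of (start, end, label, note) tuples, where start
                   and end are inclusive 1-based word positions
    """
    verse = ET.Element("verse", {"ref": ref})
    for position, text in enumerate(words, start=1):
        word = ET.SubElement(verse, "word", {"n": str(position)})
        word.text = text
    for start, end, label, note in annotations:
        if label not in ALLOWED_LABELS:
            raise ValueError(f"unknown label: {label!r}")
        if not 1 <= start <= end <= len(words):
            raise ValueError(f"span {start}-{end} is outside the verse")
        attributes = {"from": str(start), "to": str(end), "label": label}
        annotation = ET.SubElement(verse, "annotation", attributes)
        annotation.text = note
    return verse


def spans_with_label(verse, label):
    """Return the annotated word sequences that carry the given label."""
    words = {int(w.get("n")): w.text for w in verse.findall("word")}
    spans = []
    for annotation in verse.findall("annotation"):
        if annotation.get("label") == label:
            start = int(annotation.get("from"))
            end = int(annotation.get("to"))
            spans.append([words[i] for i in range(start, end + 1)])
    return spans


# Placeholder tokens stand in for the words of a verse; they are not
# actual biblical text, and the reference is only an example.
example = build_annotated_verse(
    "Job 3:3",
    ["w1", "w2", "w3", "w4"],
    [(2, 3, "metaphor", "placeholder note")],
)
print(ET.tostring(example, encoding="unicode"))
```

Because the annotations refer to word positions rather than wrapping the words themselves, such a document could in principle be kept as a separate layer or merged back into a source database, provided the word numbering matches that of the source.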
License
Starting with volume 9 (2024), articles published in HIPHIL Novum are licensed under Attribution-ShareAlike 4.0 International (CC BY-SA 4.0). The editorial board may accept other Creative Commons licenses for individual articles if required by funding bodies, e.g. the European Research Council. From volume 9 onwards, authors retain copyright to their articles and grant HIPHIL Novum the right of first publication. The authors also retain copyright to earlier versions of the articles, such as the submitted and the accepted manuscript. Authors and readers may use, reuse, and build upon the published work, use it for text or data mining, or use it for any other lawful purpose, as long as appropriate attribution is maintained.
Articles in volumes 1-8 are not licensed under Creative Commons. In these volumes, all rights are reserved to the respective authors of the articles. This implies that readers can download, read, and link to the articles, but they cannot republish them. Authors may post the published version of their article to their personal website, institutional repository, or a repository required by their funding agency as part of a green open access policy.