GBI Treebanks as a Resource for New Applications
DOI:
https://doi.org/10.7146/hn.v5i2.142738
Keywords:
tree banks, dependency trees
Abstract
The Global Bible Initiative (GBI) has developed syntactic treebanks for the Hebrew OT and the Greek NT. The treebanks were first generated by a parser using computerized Hebrew and Greek grammars and then proofed verse by verse by Hebrew and Greek scholars. All the corrections made by the scholars were kept as disambiguation data.
The phrase structures in the trees have been used to build interlinears, concordances, and translation memories that operate not only at the word level but at the phrase and clause levels as well. The syntactic relations (dependencies) in the trees have also been used for smart search, which finds texts that differ in form but are similar in meaning.
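To illustrate how searching on dependencies can retrieve passages that differ in form but share a meaning, here is a minimal sketch. It is not GBI's implementation: the verse labels, English lemmas, and relation names (`agent`, `patient`, `manner`) are hypothetical toy data, and each tree is reduced to a set of (head lemma, relation, dependent lemma) triples. A verse matches when it contains every relation in the query, regardless of word order or voice.

```python
# Toy data (hypothetical): each verse is a set of
# (head lemma, relation, dependent lemma) triples taken from its tree.
VERSES = {
    # e.g. "God loved the world"
    "verse_a": {("love", "agent", "god"), ("love", "patient", "world")},
    # e.g. "the world was so loved by God" (different surface form)
    "verse_b": {
        ("love", "agent", "god"),
        ("love", "patient", "world"),
        ("love", "manner", "so"),
    },
    # Shares the words "love" and "world" but not the relations
    "verse_c": {("love", "agent", "world"), ("love", "patient", "darkness")},
}


def smart_search(query, verses):
    """Return the verses whose relations include every relation in the query."""
    return sorted(ref for ref, deps in verses.items() if query <= deps)


query = {("love", "agent", "god"), ("love", "patient", "world")}
print(smart_search(query, VERSES))
```

In this toy example a plain word search for "love" and "world" would also return `verse_c`, whereas matching on relations excludes it because the world is the agent there, not the patient.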
Recently, we have also used the trees to improve the accuracy of automatic word alignment and to explore tree-based interactive machine translation of the Bible. The auto aligner can be used to align the Hebrew and Greek texts to translations in various languages. Interactive machine translation will speed up Bible translation without compromising quality by providing real-time suggestions and checking.
We have already contributed two sets of Greek trees to Creative Commons: the Nestle 1904 version and the SBLGNT version. We also have trees for NA27 and NA28, but we do not own the texts. The Hebrew OT treebank we developed was owned by the Groves Center. Given a morphologically tagged text, we can also create new treebanks with the parser, grammar, and disambiguation data we own.
License
Starting with volume 9 (2024), articles published in HIPHIL Novum are licensed under Attribution-ShareAlike 4.0 International (CC BY-SA 4.0). The editorial board may accept other Creative Commons licenses for individual articles if required by funding bodies, e.g. the European Research Council. From volume 9 onward, authors retain copyright to their articles and give HIPHIL Novum the right of first publication. The authors also retain copyright to earlier versions of the articles, such as the submitted and the accepted manuscript. Authors and readers may use, reuse, and build upon the published work, use it for text or data mining, or use it for any other lawful purpose, as long as appropriate attribution is maintained.
Articles in volumes 1-8 are not licensed under Creative Commons. In these volumes, all rights are reserved to the respective authors of the articles. This means that readers can download, read, and link to the articles, but they cannot republish them. Authors may post the published version of their article to their personal website, institutional repository, or a repository required by their funding agency as part of a green open access policy.