The Indexing of SAOB
We are creating a new digital version of the Swedish Academy Dictionary, SAOB. We have previously scanned the printed originals to obtain an accurate text in digital format, and we are now attempting to identify important structures in the articles automatically. These structures include, among others, the title, part of speech, etymology and the division of definitions. This paper describes our work in detail, reviews how well we are succeeding and discusses the difficulties we have encountered.
Nordisk Forening for Leksikografi/NSL and the authors.