From thesaurus to language technology resource: verbs, semantic roles and frames – a pilot study
Abstract
This paper describes a method for compiling a lexicon of Danish semantic frames based on the model of the Berkeley FrameNet (BFN). Large groups of near-synonymous verbs and verbal nouns, including multiword units, within the domains of communication and cognition are identified and extracted from the source manuscript of a newly published Danish thesaurus. Each word or expression is then assigned an appropriate frame from BFN. Because words within the same domain all belong to a manageable subset of BFN frames, a large number of words can be mapped to their corresponding frames simultaneously. In a forthcoming annotation project, where words within the same two domains have already been identified in the corpus, the idea is to pre-annotate with the frames in our lexicon, leaving human annotators to confirm the frame afterwards and to test whether the BFN semantic roles described for English can be identified in the Danish text. Our method reveals some interesting divergences between the semantic divisions established in the thesaurus and those found in BFN, showing that the two resources contribute different types of linguistic information and thereby usefully supplement one another.
License
Nordisk Forening for Leksikografi/NSL and the authors.