Linguistic Normalisation in Language Industry. Some Normative and Descriptive Aspects of Dictionary Development

Authors

  • Jan Engh

DOI:

https://doi.org/10.7146/hjlcb.v6i10.21520

Abstract

For commercial software with natural language functions, high coverage is required. This implies that only extensive lexica and complete morphologies are of interest to the language industry. For many languages, lexical and morphological information has to be collected from traditional lexicographic files and printed dictionaries. However, such material may not provide adequate information, even if trivial defects such as misprints and editorial inconsistencies are left out of account. The present paper attempts to point out how basic information on any language drawn from traditional sources has to be checked for normative correctness and descriptive adequacy, and how normalisation can only be defined relative to a given application. The presentation is based on the author's experience, and the examples are all Norwegian. Still, it is assumed to be of a general nature, highlighting some very fundamental aspects of computational linguistics which are often neglected in practice and which "everybody" is aware of all the same, but which very few, if any, have bothered to discuss in writing.

Published

1993-07-29

How to Cite

Engh, J. (1993). Linguistic Normalisation in Language Industry. Some Normative and Descriptive Aspects of Dictionary Development. HERMES - Journal of Language and Communication in Business, 6(10), 53–64. https://doi.org/10.7146/hjlcb.v6i10.21520

Section

Articles