Some resources on markup of lexicons
Compiled in preparation for upcoming EMELD workshop
Gary Simons, 13 July 2002
Existing markup proposals:
- R. A. Amsler and F. W. Tompa, 1988.
Standard for English Monolingual Dictionaries, Information in Text:
Proc. 4th Conf. of Univ. of Waterloo Centre for the New OED (October 26-28,
1988), pp. 61-80. The appendix
describes the proposed tags.
- TEI Guidelines, 1994. Chapter 12:
- John Bell and Steven Bird, 2000.
Preliminary Study of the Structure of Lexicon Entries. Proposed
- Erjavec, T., Evans, R., Ide, N., Kilgarriff, A., 2000.
Model for Lexical Databases. Proceedings of the Second Language Resources
and Evaluation Conference (LREC), Athens, Greece, 355-62. Like TEI at the
leaf level, but takes a more abstract approach to structure. An XSLT
implementation that interprets the structure and the inheritance of attributes
is given in: Ide, N., Kilgarriff, A., Romary, L. (2000).
Model of Dictionary Structure and Content. Proceedings of Euralex 2000,
Lists of elements that need to be accounted for in markup:
- List of ~100 markup
fields and ~50 lexical functions from Coward, David F. and Charles E.
Grimes. 1994. Making dictionaries: A guide to lexicography and the
Multi-Dictionary Formatter. Waxhaw, North Carolina: Summer Institute of
Linguistics.
- Gibbon, D., Peters, W., Wittenburg, P. (December 2001),
Elements for Lexicon Descriptions, Version 1.0, MPI Nijmegen.
- Conceptual model of the lexicon developed by SIL International for LinguaLinks
and FieldWorks. Start with
see the attributes listed on this object and follow links to explore the
attributes of related objects. See also
diagram for lexical database classes.
- Grimes, Joseph E. 1988. Information dependencies in lexical subentries. In
Martha W. Evens (ed.), Relational Models of the Lexicon: Representing
knowledge in semantic networks. Cambridge: Cambridge University Press. pp.
Requirements and markup philosophy:
- Nancy Ide and others, 1992.
encoding machine readable dictionaries, EURALEX'92 Proceedings.
Describes the rationale followed in developing the TEI markup for print
dictionaries.
- William Lewis, Scott Farrar, D. Terence Langendoen, 2001.
a Knowledge Base of Morphosyntactic Terminology, Proceedings of the IRCS
Workshop on Linguistic Databases. Articulates the philosophy of markup that has
been set for EMELD: use mapping to a common linguistic ontology in order to be
able to support the preferred markup schemes of contributing linguists.
- Nancy Ide, Laurent Romary, 2001.
for Language Resources, Proceedings of the IRCS Workshop on Linguistic
Databases. Advocates mapping to an abstract markup with standardized semantics.
- Wittenburg, P., Peters, W. and Drude, S.,
Analysis of Lexical Structures
from Field Linguistics and Language Engineering. In: Proceedings of
LREC2002, Las Palmas, 2002. Concludes by advocating an "Abstract Lexicon
Model" and lists basic requirements for such a model.
- Dafydd Gibbon, 2001.
lexical objects and their properties: A contribution to the 'MetaLex'
requirements specification for spoken language lexicon documentation. The
distinction between macrostructure and microstructure is particularly useful.
A taxonomy for classifying types of lexical resources: