Semset Concept

From UMBEL Wiki

A semset is a series of alternate labels and terms used to describe a concept or entity. These alternatives include true synonyms, but may be more expansive and also include jargon, slang, acronyms, or other terms that usage suggests refer to the same concept. For example, for the concept of "United States of America", possible semset entries could include:

United States, America, Amerika, US,
Uncle Sam, Great Satan, the States, USA

The semset construct is similar to the "synsets" in WordNet, but is understood more broadly. A semset comprises three kinds of labels: the prefLabel, the single (per language) preferred human-readable label for the concept; the altLabels, an inclusive listing of alternative phrases and terms for the concept (including acronyms, synonyms, and matching jargon); and the hiddenLabels, a listing of prominent or common misspellings of the concept or its alternatives.
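The three label types described above could be held in a small data structure along the following lines. This is only a minimal sketch: the class name `Semset`, its field names, and its methods are hypothetical, and the misspelling used for the hidden label is an invented illustration rather than an entry from any actual semset.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Semset:
    """Hypothetical container for one concept's labels in a single language."""

    pref_label: str                                        # prefLabel: the one preferred label
    alt_labels: set[str] = field(default_factory=set)      # altLabels: synonyms, acronyms, jargon
    hidden_labels: set[str] = field(default_factory=set)   # hiddenLabels: common misspellings

    def all_labels(self) -> set[str]:
        """Every surface form that should resolve to this concept."""
        return {self.pref_label} | self.alt_labels | self.hidden_labels

    def display_labels(self) -> set[str]:
        """Labels suitable for showing to a reader (misspellings excluded)."""
        return {self.pref_label} | self.alt_labels


usa = Semset(
    pref_label="United States of America",
    alt_labels={
        "United States", "America", "Amerika", "US",
        "Uncle Sam", "Great Satan", "the States", "USA",
    },
    # Assumed example of a misspelling; not taken from the source.
    hidden_labels={"Untied States"},
)
```

Keeping the hidden labels separate means they can be used for matching text without ever being displayed as a name for the concept.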

The semset construct is an integral part of our recommended ontology best practices and is essential to the use of ontologies for information extraction and tagging of unstructured text.
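To illustrate how a semset can support tagging of unstructured text, the sketch below maps every label back to a concept identifier and scans a string for matches. It is a simplified illustration, not a description of any particular tagger: the names `SEMSETS`, `build_matcher`, and `tag`, and the concept identifier `usa`, are assumptions made for this example. The acronym "US" is left out here because case-insensitive matching would confuse it with the pronoun "us".

```python
import re

# Hypothetical semset data: concept identifier -> labels that refer to it.
SEMSETS = {
    "usa": [
        "United States of America", "United States", "America",
        "USA", "the States", "Uncle Sam",
    ],
}


def build_matcher(semsets):
    """Return a compiled pattern and a lowercase label -> concept lookup."""
    lookup = {}
    for concept_id, labels in semsets.items():
        for label in labels:
            lookup[label.lower()] = concept_id
    # Try longer labels first so that "United States of America"
    # is preferred over the shorter "United States".
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(a) for a in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern, lookup


def tag(text, pattern, lookup):
    """Return (matched text, concept identifier) pairs found in text."""
    return [(m.group(0), lookup[m.group(0).lower()]) for m in pattern.finditer(text)]


pattern, lookup = build_matcher(SEMSETS)
tags = tag("Uncle Sam wants you; the United States of America is large.", pattern, lookup)
```

The broader the semset, the more surface forms in the text resolve to the same concept, which is why the alternative and hidden labels matter for extraction.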