A semset is a series of alternative labels and terms used to describe a concept or entity. These alternatives include true synonyms, but may also be more expansive and include jargon, slang, acronyms or other terms that usage suggests refer to the same concept. For example, for the concept of "United States of America" possible semset entries could include:
|Uncle Sam||Great Satan||the States||USA|
The semset construct is similar to the "synsets" in WordNet, but with a broader understanding of usage. A semset comprises three kinds of labels: the single (per language) preferred, human-readable label for the concept, the prefLabel; an embracing listing of alternative phrases and terms for the concept (including acronyms, synonyms, and matching jargon), the altLabels; and a listing of prominent or common misspellings for the concept or its alternatives.
The semset construct is an integral part of our recommended ontology best practices and is essential to the use of ontologies for information extraction and tagging of unstructured text.
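
As a rough illustration of how a semset might support tagging, the following Python sketch stores the labels for one concept and maps any matching surface form in a text back to the preferred label. The class name `Semset`, its field names, the `tag_text` helper, and the sample misspelling are all assumptions made for this example; they are not part of any published ontology vocabulary or library.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Semset:
    """Hypothetical container for one concept's labels (names are illustrative)."""
    pref_label: str                                        # single preferred label
    alt_labels: list[str] = field(default_factory=list)    # synonyms, acronyms, jargon
    misspellings: list[str] = field(default_factory=list)  # common misspellings

    def all_labels(self) -> list[str]:
        """Every surface form that should resolve to this concept."""
        return [self.pref_label, *self.alt_labels, *self.misspellings]


def tag_text(text: str, semsets: list[Semset]) -> list[tuple[str, str]]:
    """Return (matched surface form, preferred label) pairs found in text."""
    hits = []
    for semset in semsets:
        # Try longer labels first so a long phrase is preferred over a
        # shorter label that could match at the same position.
        labels = sorted(semset.all_labels(), key=len, reverse=True)
        pattern = r"\b(?:" + "|".join(re.escape(label) for label in labels) + r")\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append((match.group(0), semset.pref_label))
    return hits


usa = Semset(
    pref_label="United States of America",
    alt_labels=["Uncle Sam", "Great Satan", "the States", "USA"],
    misspellings=["Untied States"],  # invented for illustration only
)

print(tag_text("Exports to the States rose, and USA imports fell.", [usa]))
```

Simple case-insensitive matching of this kind ignores context, so a label such as "the States" can produce false positives; a real tagger would usually add disambiguation on top of the semset lookup.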