UMBEL - Annex J 20160510

UMBEL Annex J: Geo Module and GeoNames Mapping

UMBEL Annex Document - 20 April 2015

Latest version: http://techwiki.umbel.org/index.php/UMBEL_-_Annex_J
Last update: $Date: 2015/4/20 14:28:36 $
Version No.: 1.20
Volume: TR 12-5-21-J
Authors:
  Michael Bergman - Structured Dynamics
  Frédérick Giasson - Structured Dynamics

UMBEL: Upper Mapping and Binding Exchange Layer by Structured Dynamics LLC is provided under the
Creative Commons Attribution 3.0 license. See the attribution section for how to cite the effort.

Copyright © 2009-2015 by Structured Dynamics LLC.

Beginning with UMBEL version 1.20, statistics regarding numbers of reference concepts (RCs) in the ontology and splits between SuperTypes (STs) and modules have been moved to the statistics Annex Z document. As a result, earlier statistics in this and other annexes are no longer being updated, which means any statistics cited below may be out of date. Please consult Annex Z for the current UMBEL statistics.
Please see current mappings for the latest mappings available. Please see the modules section, also on GitHub, for the latest module version available.

INTRODUCTION

This document describes: 1) the modularization of the UMBEL ontology and the split out of a specific 'Geo' (geographical) module; 2) mapping of the new UMBEL 'core' to the GeoNames geography knowledge base and ontology[1]; and 3) recommended uses for these new structures. These changes occurred in conjunction with the preparation and release of UMBEL version 1.05.

RATIONALE

Even in the early development of UMBEL there was tension about the scope and level of geographic information to include in its concept base. The initial decision was to support concepts for countries, for the provinces and states of leading countries, and for some leading cities. This decision was in the spirit of a general reference structure, but still felt arbitrary.

GeoNames is devoted to geographical information and concepts -- both natural and human artifacts -- and has become the go-to resource for locational information and concepts. Because of its dedicated nature, GeoNames is, and will likely remain, a superior source to the arbitrary geo-locational inclusions in UMBEL.

The decision was thus made to split out the initial geo-locational information in UMBEL and replace it with mappings to GeoNames. This decision also had the advantage of beginning a process of modularization of UMBEL. There may be other standard ontologies -- GoodRelations comes to mind, for example, in the area of products and services[2] -- that might also provide useful adjuncts to the basic orienting structure of UMBEL. Modularization also provides users choice in how to deploy the system; see the recommended uses concluding section.

MODULARIZATION AND SPLIT OF UMBEL

Two sets of geo-locational reference concepts were identified as useful to split out from the 'core' UMBEL:

  1. Geopolitical places and places of human activities and facilities
  2. Natural geographical places and features.

Three iterations of manual inspection of the then-current UMBEL concept base were then undertaken to identify and remove these concepts.

Resulting Modules and Statistics

  • A total of 995 geopolitical reference concepts were removed based on these inspections
  • A total of 860 natural place reference concepts were removed based on these inspections
  • These removed concepts were then placed into a separate 'geo' module of UMBEL, including all existing annotations and relations and some other minor adjustments, resulting in a module of 1,854 concepts
  • That left 26,022 concepts in UMBEL 'core'. Because some new concepts had to be added for complete GeoNames mapping (see next) and other mappings (see Annex I), the result was a UMBEL 'core' of 26,046 concepts.

These are now the modular splits in UMBEL version 1.05.

GEONAMES MAPPING

GeoNames is structured differently from UMBEL. It has few classes; instead, it distinguishes its geographic information on the basis of some 671 feature codes. These codes span from geopolitical divisions -- such as countries, states or provinces, cities, or other administrative districts -- to splits and aggregations by natural and human features. Types of physical terrain -- above ground and underwater -- are denoted, as well as regions and landscape features governed by human activities (such as vineyards or ecological regions).[3]

Fortunately, there are corresponding concepts for most of these feature codes already in UMBEL. Where they were lacking, some new concepts were also added to UMBEL (see below).

Class <-> Feature Mappings

The 671 GeoNames feature codes were thus manually mapped to corresponding classes in the UMBEL concepts. The mapping began by searching and inspecting the current UMBEL concept base for appropriate matches. If that search failed to find a reasonably direct correspondence between an existing UMBEL concept and the feature code, the OpenCyc source concept reference was inspected in the same manner. Failing a match from either of these two sources, the decision was to add a new concept to the 'core' UMBEL.

In the course of that effort, 88 new concepts were added to UMBEL for GeoNames feature codes without proper matches. Another three (3) concepts were added back in from the source OpenCyc reference base for concepts.[4]
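
The lookup order described above (UMBEL first, then OpenCyc, then a new concept) can be sketched in Python. This is only an illustration: the actual mapping was done by manual inspection, whereas the sketch assumes a simple exact-label match, and the function name, the lookup tables, and the ID-naming rule for new concepts are all hypothetical.

```python
from typing import Dict, Tuple


def map_feature_code(
    feature_label: str,
    umbel_labels: Dict[str, str],
    opencyc_labels: Dict[str, str],
) -> Tuple[str, str]:
    """Return (concept_id, source) for one feature-code label.

    Exact, case-insensitive label lookup stands in for the manual
    inspection described in the text; it is an illustration only.
    """
    key = feature_label.strip().lower()
    # Step 1: look for a match in the current UMBEL concept base.
    if key in umbel_labels:
        return umbel_labels[key], "umbel"
    # Step 2: fall back to the OpenCyc source concepts.
    if key in opencyc_labels:
        return opencyc_labels[key], "opencyc"
    # Step 3: no match in either source, so propose a new 'core' concept.
    # The CamelCase naming rule here is an assumption for the sketch.
    new_id = "".join(word.capitalize() for word in key.split())
    return new_id, "new"


# Hypothetical lookup tables (lower-cased label -> concept ID).
UMBEL_LABELS = {"populated place": "PopulatedPlace"}
OPENCYC_LABELS = {
    "second order administrative division": "SecondOrderAdministrativeDivision",
}

for label in ("populated place",
              "Second Order Administrative Division",
              "halting place"):
    print(label, "->", map_feature_code(label, UMBEL_LABELS, OPENCYC_LABELS))
```

The third element of the fallback is the one that produced the new reference concepts listed in the tables that follow.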

Added OpenCyc Concepts

Three existing OpenCyc concepts were added for these mappings, as shown under the ID column for their respective Parent concepts:

Parent(s) | ID | prefLabel | Definition
GeopoliticalEntity | SecondOrderAdministrativeDivision | second order administrative division | A subdivision of a first-order administrative division
GeopoliticalEntity | ThirdOrderAdministrativeDivision | third order administrative division | A subdivision of a second-order administrative division
GeopoliticalEntity | FourthOrderAdministrativeDivision | fourth order administrative division | A subdivision of a third-order administrative division

Added UMBEL Concepts

As noted, another 88 new reference concepts were also added to UMBEL, as shown under the ID column for their respective Parent concepts:

Parent(s) | ID | prefLabel | Definition
Path_Customary | AccessCorridor | corridor | A strip or area of land having significance as an access way
Stream, Bypass_Path | Anabranch | anabranch | A diverging branch flowing out of a main stream and rejoining it downstream
UnderwaterTopographicalFeature | Arrugado_Undersea | undersea arrugado | An area of subdued corrugations off Baja California
Building | AslyumBuilding | asylum | A facility where the insane are cared for and protected
Unrefined_oil, TopographicalFeature | AsphaltLake | asphalt lake | A small basin containing naturally occurring asphalt
BodyOfWater | BodyOfWater_Flowing | flowing body of water | A flowing body of water that moves at various speeds from rivers to slow-moving streams or slower
BodyOfWater | BodyOfWater_Small | small body of water | Bodies of waters that are smaller than lakes
BushyTerrain, OutdoorLocation | Bush_Clump | bush(es) | A small clump of conspicuous bushes in an otherwise bare area
GroundsOfOrganization, Workplace | BusinessCenter | business center | A place where a number of businesses are located
UnderwaterTopographicalFeature | Cavity_Undersea | undersea hole | A small depression of the sea floor
Channel_WaterPath | Channel_Lake | lake channel(s) | That part of a lake having water deep enough for navigation between islands, shoals, etc.
Channel_WaterPath | Channel_Narrows | narrows | A navigable narrow part of a bay, strait, river, etc.
GroundsOfOrganization | CoastGuardStation | coast guard station | A facility from which the coast is guarded by armed vessels
TopographicalFeature | Cone_Topographical | cone(s) | A conical landform composed of mud or volcanic material
UnderwaterTopographicalFeature | Continental_Rise_Undersea | undersea continental rise | A gentle slope rising from oceanic depths towards the foot of a continental slope
UnderwaterTopographicalFeature | Cordillera_Undersea | undersea cordillera | An entire mountain system including the subordinate ranges, interior plateaus, and basins
UnderwaterLocation | Deep_Undersea | deep | A localized deep area within the confines of a larger feature, such as a trough, basin or trench
Channel_WaterPath | Distributary | distributary(-ies) | A branch which flows away from the main stream, as in a delta or irrigation canal
DrainageBasin | DrainageArea | drainage basin | An area drained by a stream
Path_Spatial | DrainageDivide | divide | A line separating adjacent drainage basins
Region, GeopoliticalEntity | EconomicDevelopmentRegion | economic region | A region of a country established for economic development or for statistical purposes
UnderwaterTopographicalFeature | Escarpment_Undersea | undersea escarpment (or scarp) | An elongated and comparatively steep slope separating flat or gently sloping areas
GroundsOfOrganization | Facility_Government_US | united states government establishment | A facility operated by the United States Government in Panama
AgriculturalFacility | Facilty_Agricultural | agricultural facility | A building and/or tract of land used for improving agriculture
Facilty_Educational | Facilty_Educational | educational facility | A building or other structure used for the education of students
ArtificialWaterFeature | Facilty_Water | water facility | A building or other structure used for the storage, diversion, treatment, supply or use of water
Region, Facilities | FishingRegion | fishing area | A fishing ground, bank or area where fishermen go to catch fish
UnderwaterTopographicalFeature | Flat_Topographical_Undersea | undersea flat | A small level or nearly level area
UnderwaterTopographicalFeature | Fork_Topographical_Undersea | undersea forks | A branch of a canyon or valley
Fossil, NaturalThing, OutdoorLocation | FossilForest | fossilized forest | A forest fossilized by geologic processes and now exposed at the earth's surface
CapitalCityOfRegion | FourthOrderSeat | fourth order seat | Seat of a fourth-order administrative division
UnderwaterTopographicalFeature | Fracture_Zone_Undersea | undersea fracture zone | An extensive linear zone of irregular topography of the sea floor, characterized by steep-sided or asymmetrical ridges, troughs, or escarpments
UnderwaterTopographicalFeature | Furrow_Topographical_Undersea | undersea furrow | A closed, linear, narrow, shallow depression
UnderwaterTopographicalFeature | Gap_Topographical_Undersea | undersea gap | A narrow break in a ridge or rise
GeopoliticalEntity | GeopoliticalDependentEntity | geopolitical dependency | Dependent political entity
CapitalCityOfRegion | GeopoliticalEntitySeat | geopolitical entity seat | Seat of government of a political entity
GeopoliticalEntity | GeopoliticalIndependentEntity | independent geopolitical entity | Independent political entity
GeopoliticalEntity | GeopoliticalIndependentEntityPart | independent political entity section | Section of independent political entity
GeopoliticalEntity | GeopoliticalSemiIndependentEntity | semi-independent political entity | Semi-independent political entity
UnderwaterTopographicalFeature | Gully_Undersea | undersea gully | A small valley-like feature
PopulatedPlace | HaltingPlace | halting place | A place where caravans stop for rest
GroundsOfOrganization, Workplace | Headquarters_Park | park headquarters | A park administrative facility
Channel_WaterPath | Headwater | headwaters | The source and upper part of a stream, including the upper drainage basin
UnderwaterTopographicalFeature | Hill_Undersea | undersea hill | An elevation rising generally less than 500 meters
UnderwaterTopographicalFeature | HillRange_Undersea | undersea range | A series of associated ridges or seamounts
UnderwaterTopographicalFeature | Knoll_Undersea | undersea knolls | Elevations rising generally more than 500 meters and less than 1,000 meters and of limited extent across the summits
NaturalRegion | LakeRegion | lake region | A tract of land distinguished by numerous lakes
Facilty_Educational | LearningCenterBuilding | continuous learning center | A center or facility for continuous learning
LotOfLand | LeasedTerritory | leased area | A tract of land leased by the United Kingdom from the People's Republic of China to form part of Hong Kong
UnderwaterTopographicalFeature | Ledge_Topographical_Undersea | undersea ledge | A rocky projection or outcrop, commonly linear and near shore
UnderwaterTopographicalFeature | Levee_Undersea | undersea levee | An embankment bordering a canyon, valley, or seachannel
UnderwaterTopographicalFeature | Mesa_Undersea | undersea mesa | An isolated, extensive, flat-topped elevation on the shelf, with relatively steep sides
UnderwaterTopographicalFeature | Moat_Undersea | undersea moat | An annular depression that may not be continuous, located at the base of many seamounts, islands, and other isolated elevations
UnderwaterTopographicalFeature | Mountain_Undersea | undersea mountains | Well-delineated subdivisions of a large and complex positive feature
Junction_Connector, TransportFacility | OilPipeline_Juncture | oil pipeline junction | A section of an oil pipeline where two or more pipes join together
TopographicalFeature | Pan_Topographical | pan (topographical) | A near-level shallow, natural depression or basin, usually containing an intermittent lake, pond, or pool
Unrefined_oil, TopographicalFeature | PetroleumBasin | petroleum basin | An area underlain by an oil-rich structural basin
UnderwaterTopographicalFeature | Pinnacle_Undersea | undersea pinnacle | A high tower or spire-shaped pillar of rock or coral, alone or cresting a summit
UnderwaterTopographicalFeature | Plateau_Undersea | undersea plateau | A comparatively flat-topped feature of considerable extent, dropping off abruptly on one or more sides
UnderwaterTopographicalFeature | Platform_Undersea | undersea platform | A flat or gently sloping underwater surface extending seaward from the shore
ReclaimedLandBasin | Polder | polder | An area reclaimed from the sea by diking and draining
PopulatedPlace | PopulatedLocality | populated locality | An area similar to a locality but with a small group of dwellings or other buildings
PopulatedPlace | PopulatedPlaceDestroyed | destroyed populated place | A village, town or city destroyed by a natural disaster, or by war
PopulatedPlace | PopulatedPlacePart | populated place section | Section of a populated place
Facilty_Educational | PreparationCenter | exam preparation center | Exam preparation center
UnderwaterTopographicalFeature | Province_Undersea | undersea province | A region identifiable by a group of similar physiographic features whose characteristics are markedly in contrast with surrounding areas
RoadwayJunction, Junction_Connector, TransportFacility | RailwayJunction | railroad junction | A place where two or more railroad tracks join
UnderwaterTopographicalFeature | Ravine_Undersea | undersea ravine | A small canyon
GeographicalRegion | Region_Artifactual | artifactual region | A geographical region that is not natural; that has been defined by human activities or use
PopulatedPlace | ReligiousSettlement | religious populated place | A populated place whose population is largely engaged in religious occupations
UnderwaterTopographicalFeature | Ridge_Undersea | undersea ridge | A long narrow elevation with steep sides
UnderwaterTopographicalFeature | Rise_Undersea | undersea rise | A broad elevation that rises gently, and generally smoothly, from the sea floor
UnderwaterTopographicalFeature | Saddle_Undersea | undersea saddle | A low part, resembling in shape a saddle, in a ridge or between contiguous seamounts
SaltFlat | SaltArea | salt area | A shallow basin or flat where salt accumulates after periodic inundation
GroundsOfOrganization | ScienceResearchBase | scientific research base | A scientific facility used as a base from which research is carried out or monitored
Channel_WaterPath | Seachannel | seachannel | A continuously sloping, elongated depression commonly found in fans or plains and customarily bordered by levees on one or two sides
UnderwaterTopographicalFeature | Shelf_Valley_Undersea | undersea shelf valley | A valley on the shelf, generally the shoreward extension of a canyon
UnderwaterTopographicalFeature | Sill_TerrainFeature_Undersea | undersea sill | The low part of a gap or saddle separating basins
UnderwaterTopographicalFeature | Slope_Topographical_Undersea | undersea slope | The slope seaward from the shelf edge to the beginning of a continental rise or the point where there is a general reduction in slope
UnderwaterTopographicalFeature | Spur_Topographical_Undersea | undersea spur | A subordinate elevation, ridge, or rise projecting outward from a larger feature
Channel_WaterPath | StreamBend | stream bend | A conspicuously curved or bent segment of a stream
Channel_WaterPath | StreamMouth | stream mouth(s) | A place where a stream discharges into a lagoon, lake, or the sea
UnderwaterTopographicalFeature | Terrace_Topographical_Undersea | undersea terrace | A relatively flat horizontal or gently inclined surface, sometimes long and narrow, which is bounded by a steeper ascending slope on one side and by a steep descending slope on the opposite side
GeographicalPlace | TriangulationPlace | triangulation station | A point on the earth whose position has been determined by triangulation
UnderwaterTopographicalFeature | Trough_Topographical_Undersea | undersea trough | A long depression of the sea floor characteristically flat bottomed and steep sided, and normally shallower than a trench
UnderwaterTopographicalFeature | Valley_Undersea | undersea valley | The axial depression of the mid-oceanic ridge system
BodyOfWater_Flowing | WaterCurrent | water current | A horizontal flow of water in a given direction with uniform velocity
Workplaces | Workplace_Temporary | temporary work office | A temporary work office

Note that the predominant share of new concepts is related to underwater terrain features. This is a minor source of information, but it makes the coverage of the GeoNames feature codes complete.

Choice of Mapping Predicate

As noted, UMBEL and GeoNames differ structurally, so typical mapping approaches do not apply directly.

As Annex H observes, the semantic Web community has for some time grappled with the issue of the owl:sameAs predicate. What is often wanted is a predicate somewhere along the spectrum of relatedness: one that says two things are of the same kind, without committing to whether the relationship is an equivalence (identity, or owl:sameAs), a subset, or merely a member-of relationship.

For relating a feature code to a true class, we want an umbrella property that is generally equivalent to owl:sameAs in nature, but that may turn out to be only approximate if the relationship were known precisely. The appropriate choice here is umbel:correspondsTo, which was designed specifically for this purpose.[5] To understand how this predicate applies, it is worthwhile to repeat its formal specification:

Property name: umbel:correspondsTo
Description: The property umbel:correspondsTo is used to assert a close correspondence between an external class, named entity, individual or instance with a Reference Concept class. umbel:correspondsTo relates the external class, named entity, individual or instance to the class through the basis of both its subject matter and intended scope. This predicate should be used where the correspondence between the two entities is felt to be nearly equivalent to an owl:sameAs assertion, and is reflexive, but without the full entailments of intensional class memberships. In these cases, both entities are understood to have the same type and intended scope, but without asserting a full class-level or owl:sameAs individual relationship.

This predicate is designed for the circumstance of aligning two different ontologies or knowledge bases based on node-level correspondences, but without entailing the actual ontological relationships and structure of the object source. For example, the umbel:correspondsTo predicate is used to assert close correspondence between UMBEL Reference Concepts and Wikipedia categories or pages, yet without entailing the actual Wikipedia category structure.

This property asserts a different and stronger relationship than umbel:isAbout. One practical use is to guide specific instance member determinations when, say, the native structure of the external ontology or knowledge base is to be analyzed and replaced with an UMBEL-based structure.

This property is therefore used to create a nearly equivalent assertion (however, with the degree of that equivalence being unknown or unknowable) between an external instance or class and a Reference Concept class.

Domain: owl:Thing
Range: umbel:RefConcept
Reflexive: True
Status: Testing

This predicate is thus the basis for the 671 mappings between UMBEL and GeoNames feature codes.
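
As a rough illustration of what one such mapping looks like, the sketch below builds a single umbel:correspondsTo assertion as a plain (subject, predicate, object) tuple and checks it against the stated range. The namespace URIs, the `A.ADM2` feature code URI, the helper names, and the pairing with SecondOrderAdministrativeDivision are assumptions made for the example; consult the published mapping file for the actual assertions.

```python
from typing import Tuple

Triple = Tuple[str, str, str]

# Namespace URIs assumed for this sketch.
UMBEL_VOCAB = "http://umbel.org/umbel#"
UMBEL_RC = "http://umbel.org/umbel/rc/"
GEONAMES = "http://www.geonames.org/ontology#"

CORRESPONDS_TO = UMBEL_VOCAB + "correspondsTo"


def corresponds_to(external_uri: str, ref_concept_id: str) -> Triple:
    """Build one assertion: external thing -> UMBEL reference concept."""
    return (external_uri, CORRESPONDS_TO, UMBEL_RC + ref_concept_id)


def respects_range(triple: Triple) -> bool:
    """Check that the object is in the reference concept namespace.

    This approximates the stated range (umbel:RefConcept) by namespace
    only; the domain is owl:Thing, so any subject is acceptable.
    """
    _, predicate, obj = triple
    return predicate == CORRESPONDS_TO and obj.startswith(UMBEL_RC)


# Illustrative pairing only (hypothetical, not taken from the mapping file).
mapping = corresponds_to(GEONAMES + "A.ADM2",
                         "SecondOrderAdministrativeDivision")
print(mapping)
print(respects_range(mapping))
```

Note the direction of the assertion: the external GeoNames feature code is the subject and the UMBEL reference concept is the object, which follows from the domain and range given in the specification.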

RECOMMENDED USES

Here are some tentative guidelines on how to use these modules and mappings.

Modes

There are thus three modes in which one can now work with UMBEL:

  1. With UMBEL 'core' alone, recommended when your concept space is not concerned with geographical information
  2. UMBEL 'core' plus the UMBEL 'geo' module -- equivalent to prior versions of UMBEL, or
  3. UMBEL 'core' plus GeoNames, recommended where geographical information is important to your concept space.

Actual Use

You can load the core UMBEL without any modules. If you want a module, simply import it into the ontology; its concepts will then relate properly to the concepts of the UMBEL core.

If used in a modular fashion, you should:

  1. Load the base reference concepts (https://github.com/structureddynamics/UMBEL/blob/master/Reference%20Structure/umbel_reference_concepts.n3 in this case)
  2. then, either:
    1. load the UMBEL geo module, or
    2. invoke a mapping file to an external correspondence (using GeoNames in our case).
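
The three modes and the loading steps above can be summarized as a small load plan. This is a sketch under stated assumptions: only the core file name comes from the text; the geo module and GeoNames mapping file names, the `Mode` enum, and `files_to_load` are hypothetical placeholders.

```python
from enum import Enum
from typing import List


class Mode(Enum):
    CORE_ONLY = 1           # concept space without geographical information
    CORE_PLUS_GEO = 2       # equivalent to prior versions of UMBEL
    CORE_PLUS_GEONAMES = 3  # geographical information is important


# File name taken from the text above.
CORE_FILE = "umbel_reference_concepts.n3"
# Hypothetical file names, for illustration only.
GEO_MODULE_FILE = "umbel_geo_module.n3"
GEONAMES_MAPPING_FILE = "geonames_mapping.n3"


def files_to_load(mode: Mode) -> List[str]:
    """Return the files to load, in order, for the chosen mode."""
    files = [CORE_FILE]  # step 1: always load the base reference concepts
    if mode is Mode.CORE_PLUS_GEO:
        files.append(GEO_MODULE_FILE)        # step 2, option 1
    elif mode is Mode.CORE_PLUS_GEONAMES:
        files.append(GEONAMES_MAPPING_FILE)  # step 2, option 2
    return files


for mode in Mode:
    print(mode.name, files_to_load(mode))
```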

In the latter case, you may use SPARQL queries with the umbel:correspondsTo predicate to achieve the desired retrievals. If more logic is required, you will likely need to look to a rules-based addition such as SWRL[6] or RIF[7] to capture the umbel:correspondsTo semantics.
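
A retrieval of this kind might look like the sketch below. The SPARQL text is held as a string for reference, and a pure-Python pattern match over in-memory triples shows the equivalent lookup without depending on any RDF library. The namespace URIs, the sample triples, and the helper name `match` are assumptions for illustration, not contents of the actual mapping file.

```python
from typing import Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Namespace URIs assumed for this sketch.
CORRESPONDS_TO = "http://umbel.org/umbel#correspondsTo"
RC = "http://umbel.org/umbel/rc/"
GN = "http://www.geonames.org/ontology#"

# The same lookup expressed as SPARQL (not executed here).
SPARQL_QUERY = """
PREFIX umbel: <http://umbel.org/umbel#>
SELECT ?external ?rc
WHERE { ?external umbel:correspondsTo ?rc . }
"""

# Hypothetical sample of mapping triples.
TRIPLES: List[Triple] = [
    (GN + "A.ADM2", CORRESPONDS_TO, RC + "SecondOrderAdministrativeDivision"),
    (GN + "A.ADM3", CORRESPONDS_TO, RC + "ThirdOrderAdministrativeDivision"),
]


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Which external codes correspond to a given reference concept?
target = RC + "ThirdOrderAdministrativeDivision"
codes = [t[0] for t in match(TRIPLES, p=CORRESPONDS_TO, o=target)]
print(codes)
```

A plain pattern match like this only retrieves asserted triples; it does not capture any further semantics of the predicate, which is where a rules layer would come in.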

ASSOCIATED FILES

ENDNOTES

  1. For more information on GeoNames, see http://www.geonames.org/. The complete mapping to GeoNames is based on its 671 feature codes, which describe natural, geopolitical, and human activity geo-locational information; see further http://www.geonames.org/statistics/total.html.
  2. See http://www.heppnetz.de/projects/goodrelations/.
  3. See these statistics and codes related to the GeoNames feature codes: http://www.geonames.org/statistics/total.html
  4. See further the discussion of the use of OpenCyc in the UMBEL specifications.
  5. Approximate relationships are discussed in M.K. Bergman, 2010. "The Nature of Connectedness on the Web," AI3:::Adaptive Information blog, November 22, 2010; see http://www.mkbergman.com/935/the-nature-of-connectedness-on-the-web/. One option, for example, is the x:coref predicate from the UMBC Ebiquity group; see further Jennifer Sleeman and Tim Finin, 2010. "Learning Co-reference Relations for FOAF Instances," Proceedings of the Poster and Demonstration Session at the 9th International Semantic Web Conference, November 2010; see http://ebiquity.umbc.edu/_file_directory_/papers/522.pdf. In the words of Tim Finin of the Ebiquity group:
    The solution we are currently exploring is to define a new property to assert that two RDF instances are co-referential when they are believed to describe the same object in the world. The two RDF descriptions might be incompatible because they are true at different times, or the sources disagree about some of the facts, or any number of reasons, so merging them with owl:sameAs may lead to contradictions. However, virtually merging the descriptions in a co-reference engine is fine -- both provide information that is useful in disambiguating future references as well as for many other purposes. Our property (:coref) is a transitive, symmetric property that is a super-property of owl:sameAs and is paired with another, :notCoref that is symmetric and generalizes owl:differentFrom.

    When we look at the analogous properties noted above, we see that the property objects tend to share reflexivity, symmetry and transitivity. We specifically designed the umbel:correspondsTo predicate to capture these close, nearly equivalent relationships of uncertain degree.

  6. SWRL (Semantic Web Rule Language) combines sublanguages of the OWL Web Ontology Language (OWL DL and Lite) with those of the Rule Markup Language (Unary/Binary Datalog). SWRL has the full power of OWL DL, but at the price of decidability and practical implementations. See further http://www.w3.org/Submission/SWRL/.
  7. The Rule Interchange Format (RIF) is a W3C Recommendation. RIF is based on the observation that there are many "rules languages" in existence, and what is needed is to exchange rules between them. RIF includes three dialects: a Core dialect, which is extended into a Basic Logic Dialect (BLD) and a Production Rule Dialect (PRD). See further http://www.w3.org/2005/rules/wiki/RIF_FAQ.