SEMIC

Core Location Vocabulary

Status: Working Draft
Published at: 2021-04-01
This version: https://semiceu.github.io/Core-Location-Vocabulary/releases/2.00

Summary

Locations can be described in three principal ways: by using a place name, a geometry or an address. The specific context will determine which method of describing a location is most appropriate. The Core Location Vocabulary provides structure for all three.

ISO 19112 defines a location as "an identifiable geographic place." With this in mind, "Eiffel Tower", "Madrid" and "California" are all locations, and simply using a recognised name in this way is common in public sector data. Such names can be highly ambiguous, however, because many places share the same or similar names.

In addition to a simple (string) label or name for a Location, this vocabulary defines a property that allows a Location to be defined by a URI, such as a GeoNames or DBpedia URI.

No cardinality constraints are placed on any property of the Location, Address or Geometry classes in order to maximise flexibility. A single address may be defined in different ways, a geometry may be defined using different coordinate reference systems and a single place may have no recognised name or multiple names. The Core Location Vocabulary makes a minimum number of assumptions about what data will be encoded. However, it clearly makes no sense to define any of the location classes without any properties or to provide multiple instances of the same property with conflicting values.

Status of this document

This Core Vocabulary has the status of Working Draft published on 2021-04-01.

Conformance

TBD

Overview

This document describes how to use the following entities of the Core Vocabulary:
| Address | Geometry | Location | Resource |

Entities

Address

Description
A spatial object that in a human-readable way identifies a fixed location of a property.
Usage

An "address representation" as conceptually defined by the INSPIRE Address Representation data type: "Representation of an address spatial object for use in external application schemas that need to include the basic, address information in a readable way."


The representation of Addresses varies widely from one country's postal system to another. Even within countries, there are almost always examples of Addresses that do not conform to the stated national standard. At the time of publication, work is progressing on ISO 19160-1, which defines a method through which different Addresses can be converted from one conceptual model to another.


This specification is heavily based on the INSPIRE Address Representation data type. Notably, an Address provided using the detailed breakdown suggested by the properties for this class will be INSPIRE-conformant. To this very granular set of properties, we add two further properties:

  • full address (the complete address as a formatted string)
  • addressID (a unique identifier for the address)

The first of these allows publishers to simply provide the complete Address as one string, with or without formatting. This is analogous to vCard's label property.


The addressID is part of the INSPIRE guidelines and provides a hook that can be used to link the Address to an alternative representation, such as vCard or OASIS xAL.

Properties
For this entity the following properties are defined: address area, address ID, administrative unit level 1 (country), administrative unit level 2 (county/region/state), full address, locator designator, locator name, post code, post name (city), post office box, thoroughfare.
Property Expected Range Description Usage Codelist
address area Text The name or names of a geographic area or locality that groups a number of addressable objects for addressing purposes, without being an administrative unit. This would typically be part of a city, a neighbourhood or village, e.g. Montmartre.
address ID String A globally unique identifier for each instance of an Address.

The concept of adding a globally unique identifier for each instance of an address is a crucial part of the INSPIRE data specification. A number of EU countries, among them Denmark, have already implemented an ID (a UUID) in their address register or gazetteer. OASIS xAL also includes an address identifier. It is the address identifier that allows an address to be represented in a format other than INSPIRE whilst remaining conformant to the Core Vocabulary.


The INSPIRE method of representing addresses is very detailed and is designed primarily for use in databases of addresses. Whilst data that is published in full conformance with the INSPIRE data structure can be made available using the Core Location Vocabulary, the reverse is not true, since the Core Vocabulary allows much greater flexibility.


Many datasets that include address data as one piece of information about something else are likely to have that data in simpler formats. These might be tailored to the specific need of the dataset, follow a national norm, or make use of a standard like vCard.


To provide maximum flexibility in the Core Vocabulary, whilst remaining interoperable with the INSPIRE Address Guidelines (which EU Member States are obliged to use), the Core Location Vocabulary provides the extra property of full address and makes use of INSPIRE's addressID.

administrative unit level 1 (country) Code The name or names of a unit of administration related to the exercise of jurisdictional rights, for local, regional and national governance. Level 1 refers to the uppermost administrative unit for the address, almost always a country. Best practice is to use the ISO 3166-1 code but if this is inappropriate for the context, country names should be provided in a consistent manner to reduce ambiguity. For example, write either 'France' or 'FRA' consistently throughout the dataset and avoid mixing the two. The Country controlled vocabulary from the Publications Office can be reused for this.
administrative unit level 2 (county/region/state) Text The name or names of a unit of administration related to the exercise of jurisdictional rights, for local, regional and national governance. Level 2 refers to the region of the address, usually a county, state or other such area that typically encompasses several localities. Some recommended codelists from the EU Publications Office include: Administrative Territorial Units (ATU), NUTS and Local Administrative Units (LAU). For example, the first arrondissement of Paris is expressed as "http://publications.europa.eu/resource/authority/atu/FRA_AR_PAR01" in the ATU controlled vocabulary.
full address Text The complete address written as a formatted string. Use of this property is recommended as it will not suffer any misunderstandings that might arise through the breaking up of an address into its component parts. This property is analogous to vCard's label property but with two important differences: (1) formatting is not assumed so that, unlike vCard label, it may not be suitable to print this on an address label, (2) vCard's label property has a domain of vCard Address; the fullAddress property has no such restriction. An example of a full address is "Champ de Mars, 5 Avenue Anatole France, 75007 Paris, France".
locator designator String A number or a sequence of characters which allows a user or an application to interpret, parse and format the locator within the relevant scope. A locator may include more than one locator designator. In simpler terms, this is the building number, apartment number, etc. For an address such as "Flat 3, 17 Bridge Street", the locator designator is "Flat 3, 17".
locator name Text Proper noun(s) applied to the real world entity identified by the locator.

The locator name could be the name of the property or complex, of the building or part of the building, or it could be the name of a room inside a building.


The key difference between a locator designator and a locator name is that the latter is a proper name and is unlikely to include digits. For example, "Schuman, Berlaymont" is a meeting room within the European Commission headquarters for which locator name is more appropriate than locator designator.

post code String The post/zip code of an address. (INSPIRE's definition is "A code created and maintained for postal purposes to identify a subdivision of addresses and postal delivery points.") Post codes are common elements in many countries' postal address systems. One of the many post codes of Paris is for example "75000".
post name (city) Text The key postal division of the address, usually the city. (INSPIRE's definition is "One or more names created and maintained for postal purposes to identify a subdivision of addresses and postal delivery points.") For example, "Paris".
post office box String The Post Office Box number. INSPIRE's name for this is "postalDeliveryIdentifier" for which it uses the locator designator property with a type attribute of that name. This vocabulary separates out the Post Office Box for greater independence of technology. An example post office box number is "9383".
thoroughfare Text An address component that represents the name or names of a passage or way through from one location to another. A thoroughfare is not necessarily a road, it might be a waterway or some other feature. For example, "Avenue des Champs-Élysées".
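As an illustration of how the full address and the granular properties can sit side by side, the following Python sketch builds a JSON-LD-style dictionary for an Address. The `Address` dataclass, the `to_jsonld` helper, the namespace `http://www.w3.org/ns/locn#` and the property local names are assumptions of this sketch and should be checked against the published JSON-LD context. The breakdown of the example address into components is likewise illustrative only.

```python
from dataclasses import dataclass, asdict
from typing import Optional

# Assumed namespace for the vocabulary's terms; verify against the
# published JSON-LD context before relying on it.
LOCN = "http://www.w3.org/ns/locn#"


@dataclass
class Address:
    """Hypothetical container. Every field is optional, mirroring the
    absence of cardinality constraints in the vocabulary."""
    full_address: Optional[str] = None
    thoroughfare: Optional[str] = None
    locator_designator: Optional[str] = None
    post_code: Optional[str] = None
    post_name: Optional[str] = None
    admin_unit_l1: Optional[str] = None
    address_id: Optional[str] = None


# Field name -> assumed property local name in the LOCN namespace.
_TERMS = {
    "full_address": "fullAddress",
    "thoroughfare": "thoroughfare",
    "locator_designator": "locatorDesignator",
    "post_code": "postCode",
    "post_name": "postName",
    "admin_unit_l1": "adminUnitL1",
    "address_id": "addressId",
}


def to_jsonld(address: Address) -> dict:
    """Return a dictionary holding only the properties that were set."""
    doc = {"@type": LOCN + "Address"}
    for field_name, value in asdict(address).items():
        if value is not None:
            doc[LOCN + _TERMS[field_name]] = value
    return doc


# The full address string comes from the full address property description;
# the component values are an illustrative decomposition of it.
example = Address(
    full_address="Champ de Mars, 5 Avenue Anatole France, 75007 Paris, France",
    thoroughfare="Avenue Anatole France",
    locator_designator="5",
    post_code="75007",
    post_name="Paris",
    admin_unit_l1="FRA",
)
```

Calling `to_jsonld(example)` yields a dictionary that omits the address ID because it was never set. A publisher with only a formatted string could populate `full_address` alone.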

Geometry

Description
The Geometry class provides the means to identify a Location as a point, line, polygon, etc. expressed using coordinates in some coordinate reference system.
Usage
This class defines the notion of "geometry" at the conceptual level, and it can be encoded using different formats (see the usage note of the locn:geometry property). The Examples section of this specification gives a number of geometry examples expressed in different formats.
Properties
For this entity the following properties are defined: coordinates, crs, geometry type, gml, latitude, longitude, wkt.
| Property | Expected Range | Description | Usage | Codelist |
| coordinates | String | The coordinate list. | | |
| crs | URI | An identifier for the coordinate reference system. | | |
| geometry type | Code | The geometry type, e.g. point, line or polygon. | | |
| gml | Literal | The geometry written in Geography Markup Language. | Use "http://www.opengis.net/ont/geosparql#gmlLiteral" as type for the literal. | |
| latitude | String | The latitude. | | |
| longitude | String | The longitude. | | |
| wkt | Literal | The well-known text representation string describing the point, line or polygon. | Use "http://www.opengis.net/ont/geosparql#wktLiteral" as type for the literal. | |
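
To make the typed-literal guidance for the wkt property concrete, here is a minimal Python sketch that builds a WKT point and renders it as a typed literal in Turtle/N-Triples syntax. The helper names are hypothetical and the coordinates are purely illustrative. The sketch assumes longitude-first axis order, but the correct order depends on the coordinate reference system in use.

```python
# Datatype URI taken from the wkt property description above.
WKT_LITERAL = "http://www.opengis.net/ont/geosparql#wktLiteral"


def wkt_point(longitude: float, latitude: float) -> str:
    """Return the WKT text for a point.

    Longitude-first order is an assumption of this sketch; check the axis
    order required by the coordinate reference system you publish in.
    """
    return f"POINT({longitude} {latitude})"


def typed_literal(lexical: str, datatype: str) -> str:
    """Render a typed literal in Turtle/N-Triples syntax."""
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{datatype}>'


# Illustrative coordinates only.
literal = typed_literal(wkt_point(4.35, 50.85), WKT_LITERAL)
```

The same `typed_literal` helper could be reused for GML by passing the gmlLiteral datatype URI and a GML string instead.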

Location

Description
An identifiable geographic place or named place.
Properties
For this entity the following properties are defined: address, geographic identifier, geographic name, geometry.
Property Expected Range Description Usage Codelist
address Address The address relationship associates any Resource with the Address class (i.e. anything can be linked to its address using this property). Asserting the address relationship implies that the Resource has an Address.
geographic identifier URI A URI that identifies the Location.

GeoNames.org provides stable, widely recognised identifiers for more than 10 million geographical names that can be used as links to further information. For example, http://sws.geonames.org/593116/ identifies the Lithuanian capital Vilnius. Unfortunately these URIs cannot easily be automatically deduced since the URI scheme uses simple numeric codes. Finding a GeoNames identifier for a Location is almost always a manual process. Where such identifiers are known or can be found, however, it is recommended that they be used.


Where the Location class is used to identify a country and the GeoNames URI is not known, the recommendation is to use DBpedia URIs of the form http://dbpedia.org/resource/ISO_3166-1:XX where XX is the ISO 3166-1 two-letter code for the country.


The EU's Publications Office diverges from ISO 3166-1 and uses EL and UK for Greece and the United Kingdom respectively. DBpedia sticks to the ISO codes and so the correct URIs for these countries are:

  • http://dbpedia.org/resource/ISO_3166-1:GR
  • http://dbpedia.org/resource/ISO_3166-1:GB

even when the geographic name is given as EL or UK.


The use of URIs has added advantages:

  • they can be used by automated systems to look up additional data (linked data);
  • a triple store may store only one copy of the URI, whereas if a string is used, a copy of that string is always stored for each and every record in the database. Thus, in large data sets, the saving on memory capacity and the improvement in transmission efficiency can be substantial.
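
The DBpedia URI pattern and the EL/UK exception described above can be sketched in a few lines of Python. The function name `dbpedia_country_uri` is hypothetical, and the sketch only checks the shape of the code (two ASCII letters), not whether it is actually assigned in ISO 3166-1.

```python
# URI prefix taken from the usage note above.
DBPEDIA_COUNTRY_PREFIX = "http://dbpedia.org/resource/ISO_3166-1:"

# Publications Office codes that differ from ISO 3166-1.
_PUBLICATIONS_OFFICE_TO_ISO = {"EL": "GR", "UK": "GB"}


def dbpedia_country_uri(code: str) -> str:
    """Build the DBpedia country URI for a two-letter country code.

    Accepts either an ISO 3166-1 code or the Publications Office
    variants EL and UK, which are mapped back to GR and GB.
    """
    normalised = code.strip().upper()
    if len(normalised) != 2 or not (normalised.isascii() and normalised.isalpha()):
        raise ValueError(f"expected a two-letter country code, got {code!r}")
    iso_code = _PUBLICATIONS_OFFICE_TO_ISO.get(normalised, normalised)
    return DBPEDIA_COUNTRY_PREFIX + iso_code
```

For instance, `dbpedia_country_uri("EL")` and `dbpedia_country_uri("GR")` both return the URI ending in `ISO_3166-1:GR`.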

geographic name Text A geographic name is a proper noun applied to a spatial object.

The INSPIRE Data Specification on Geographical Names provides a detailed model for describing a 'named place', including methods for providing multiple names in multiple scripts. This is beyond what is necessary for the Core Location Vocabulary but, importantly, the concept of a geographic name used here is consistent.


Taking the example used in the INSPIRE document (page 15), the following are all valid geographic names for the Greek capital:

  • "Αθήνα"@el-Grek (the Greek endonym written in the Greek script)
  • "Athína"@el-Latn (the standard Romanisation of the endonym)
  • "Athens"@en (the English language exonym)

INSPIRE has a detailed (XML-based) method of providing metadata about a geographic name, and in XML data sets that may be the most appropriate method to follow. When using the Core Location Vocabulary in data sets that are not focussed on environmental/geographical data (the use case for INSPIRE), the Code datatype or a simple language identifier may be used to provide such metadata.


The country codes defined in ISO 3166 may be used as geographic names and these are generally preferred over either the long form or short form of a country's name (as they are less error prone). The Publications Office of the European Union recommends the use of ISO 3166-1 codes for countries in all cases except two:

  • use 'UK' in preference to the ISO 3166 code GB for the United Kingdom;
  • use 'EL' in preference to the ISO 3166 code GR for Greece.

Where a country has changed its name or no longer exists (such as Czechoslovakia, Yugoslavia etc.) use the ISO 3166-3 code.
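
The "simple language identifier" approach to multiple geographic names can be sketched as follows. The `GeographicName` type and both helpers are hypothetical. The language tags use `el` for Greek, which is the BCP 47 language subtag and an assumption of this sketch rather than something prescribed by the vocabulary.

```python
from typing import List, NamedTuple, Optional


class GeographicName(NamedTuple):
    """Hypothetical pairing of a name with a language tag."""
    text: str
    language: str


def to_literal(name: GeographicName) -> str:
    """Render the name as a language-tagged literal in Turtle syntax."""
    escaped = name.text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{name.language}'


def pick_name(names: List[GeographicName], preferred: str) -> Optional[GeographicName]:
    """Return the first name whose tag equals `preferred` or extends it
    with a subtag (for example "el" matches "el-Grek"); None if no match."""
    wanted = preferred.lower()
    for name in names:
        tag = name.language.lower()
        if tag == wanted or tag.startswith(wanted + "-"):
            return name
    return None


athens = [
    GeographicName("Αθήνα", "el-Grek"),
    GeographicName("Athína", "el-Latn"),
    GeographicName("Athens", "en"),
]
```

Here `pick_name(athens, "en")` selects the English exonym, while `pick_name(athens, "el")` returns the first Greek-language entry in the list.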

geometry Literal, Geometry or URI Associates any Resource with the corresponding geometry.

Depending on how a geometry is encoded, the range of this property may be one of the following:


For interoperability reasons, it is recommended to use one of the following:


Resource

Description
The class resource, everything.
Properties
For this entity the following properties are defined: location.
| Property | Expected Range | Description | Usage | Codelist |
| location | Location | The location relationship associates any Resource with the Location class. | Asserting the location relationship implies only that the domain has some connection to a Location in time or space. It does not imply that the Resource is necessarily at that Location at the time when the assertion is made. | |

Geometry examples

Geometry RDF examples

WKT (with a typed literal as range)

GML (with a typed literal as range)

WKT (with a geometry class as range)

GML (with a geometry class as range)

RDF (WGS84 lat/long)

RDF (schema.org)

URI reference (GeoHash)
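
As a rough illustration of how a GeoHash-based URI reference might be produced, the Python sketch below implements the standard geohash encoding (interleaved longitude and latitude bits, base-32 alphabet) and appends the result to a base URI. It is not the specification's own example. The function names and the base URI `http://geohash.org/` are assumptions of this sketch.

```python
# Base-32 alphabet used by the geohash encoding (no a, i, l, o).
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

# Assumed base URI for geohash references.
GEOHASH_BASE_URI = "http://geohash.org/"


def geohash_encode(latitude: float, longitude: float, precision: int = 9) -> str:
    """Encode a WGS84 latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_longitude = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_longitude:
            mid = (lon_lo + lon_hi) / 2
            if longitude >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if latitude >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_longitude = not use_longitude
        bit_count += 1
        if bit_count == 5:  # every 5 bits form one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def geohash_uri(latitude: float, longitude: float, precision: int = 9) -> str:
    """Return a URI reference for the point (base URI is an assumption)."""
    return GEOHASH_BASE_URI + geohash_encode(latitude, longitude, precision)
```

Longer hashes identify smaller cells, so the `precision` argument controls how closely the URI pins down the point.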

Geometry XML examples

WKT (GeoSPARQL)

GML

URI reference (GeoHash)

Changelog w.r.t. previous version

(non-normative)

A changelog describing the (major) changes between the previous version (1.0.0) of the Core Location Vocabulary and the new version proposed in this specification (2.0.0) can be found here.

JSON-LD context

(non-normative)

A reusable JSON-LD context definition for this Core Vocabulary is retrievable at: /context/core_location.jsonld