SEMIC

Core Person Vocabulary

Status
Working Draft
Published at
2021-04-01
This version
https://semiceu.github.io/Core-Person-Vocabulary/releases/2.00

Summary

The Core Person Vocabulary provides a minimum set of classes and properties for describing a natural person, i.e. the individual as opposed to any role they may play in society or the relationships they have to other people, organisations and property; all of which contribute significantly to the broader concept of identity.

Status of this document

This Core Vocabulary has the status of Working Draft published on 2021-04-01.

Conformance

TBD

Overview

This document describes how to use the following entities correctly when applying the Core Vocabulary:
| Address | Concept | Identifier | Jurisdiction | Location | Person | Person |

Entities

Address

Description
A spatial object that in a human-readable way identifies a fixed location of a property.
Usage

An "address representation" as conceptually defined by the INSPIRE Address Representation data type: "Representation of an address spatial object for use in external application schemas that need to include the basic, address information in a readable way."


The representation of Addresses varies widely from one country's postal system to another. Even within countries, there are almost always examples of Addresses that do not conform to the stated national standard. At the time of publication, work is progressing on ISO 19160-1 that defines a method through which different Addresses can be converted from one conceptual model to another.


This specification was heavily based on the INSPIRE Address Representation data type. It is noteworthy that if an Address is provided using the detailed breakdown suggested by the properties for this class, then it will be INSPIRE-conformant. To this very granular set of properties, we add two further properties:

  • full address (the complete address as a formatted string)
  • addressID (a unique identifier for the address)
The first of these allows publishers to simply provide the complete Address as one string, with or without formatting. This is analogous to vCard's label property.


The addressID is part of the INSPIRE guidelines and provides a hook that can be used to link the Address to an alternative representation, such as vCard or OASIS xAL.

Properties
For this entity the following properties are defined: address area, address ID, administrative unit level 1 (country), administrative unit level 2 (country/region/state), full address, locator designator, locator name, post code, post name (city), post office box, thoroughfare.
Property Expected Range Description Usage Codelist
address area Text The name or names of a geographic area or locality that groups a number of addressable objects for addressing purposes, without being an administrative unit. This would typically be part of a city, a neighbourhood or village, e.g. Montmartre.
address ID String A globally unique identifier for each instance of an Address.

The concept of adding a globally unique identifier for each instance of an address is a crucial part of the INSPIRE data specification. A number of EU countries, among them Denmark, have already implemented an ID (a UUID) in their address register or gazetteer. OASIS xAL also includes an address identifier. It is the address identifier that allows an address to be represented in a format other than INSPIRE whilst remaining conformant to the Core Vocabulary.


The INSPIRE method of representing addresses is very detailed and designed primarily for use in databases of addresses. Whilst data that is published in full conformance with the INSPIRE data structure can be made available using the Location Core Vocabulary, the reverse is not true, since the Core Vocabulary allows much greater flexibility.


Many datasets that include address data as one piece of information about something else are likely to have that data in simpler formats. These might be tailored to the specific need of the dataset, follow a national norm, or make use of a standard like vCard.


To provide maximum flexibility in the Core Vocabulary, whilst remaining interoperable with INSPIRE Address Guidelines (which EU Member States are obliged to use), the Location Core Vocabulary provides the extra property of full address and makes use of INSPIRE's addressID.

administrative unit level 1 (country) The name or names of a unit of administration where a Member State has and/or exercises jurisdictional rights, for local, regional and national governance. Level 1 refers to the uppermost administrative unit for the address, almost always a country. Best practice is to use the ISO 3166-1 code but if this is inappropriate for the context, country names should be provided in a consistent manner to reduce ambiguity. For example, either write 'France' or 'FRA' consistently throughout the dataset and avoid mixing the two. The Country controlled vocabulary from the Publications Office can be reused for this.
administrative unit level 2 (country/region/state) Text The name or names of a unit of administration where a Member State has and/or exercises jurisdictional rights, for local, regional and national governance. Level 2 refers to the region of the address, usually a county, state or other such area that typically encompasses several localities. Some recommended codelists from the EU Publications Office include: Administrative Territorial Units (ATU), NUTS and Local Administrative Units (LAU). For example, the first arrondissement of Paris is expressed as "http://publications.europa.eu/resource/authority/atu/FRA_AR_PAR01" in the ATU controlled vocabulary.
full address Text The complete address written as a formatted string. Use of this property is recommended as it will not suffer any misunderstandings that might arise through the breaking up of an address into its component parts. This property is analogous to vCard's label property but with two important differences: (1) formatting is not assumed so that, unlike vCard label, it may not be suitable to print this on an address label, (2) vCard's label property has a domain of vCard Address; the fullAddress property has no such restriction. An example of a full address is "Champ de Mars, 5 Avenue Anatole France, 75007 Paris, France".
locator designator String A number or a sequence of characters which allows a user or an application to interpret, parse and format the locator within the relevant scope. A locator may include more than one locator designator. In simpler terms, this is the building number, apartment number, etc. For an address such as "Flat 3, 17 Bridge Street", the locator designator is "Flat 3, 17".
locator name Text Proper noun(s) applied to the real world entity identified by the locator.

The locator name could be the name of the property or complex, of the building or part of the building, or it could be the name of a room inside a building.


The key difference between a locator designator and a locator name is that the latter is a proper name and is unlikely to include digits. For example, "Schuman, Berlaymont" is a meeting room within the European Commission headquarters for which locator name is more appropriate than locator designator.

post code String The post/zip code of an address. (INSPIRE's definition is "A code created and maintained for postal purposes to identify a subdivision of addresses and postal delivery points.") Post codes are common elements in many countries' postal address systems. One of the many post codes of Paris is for example "75000".
post name (city) Text The key postal division of the address, usually the city. (INSPIRE's definition is "One or more names created and maintained for postal purposes to identify a subdivision of addresses and postal delivery points.") For example, "Paris".
post office box String The Post Office Box number. INSPIRE's name for this is "postalDeliveryIdentifier" for which it uses the locator designator property with a type attribute of that name. This vocabulary separates out the Post Office Box for greater independence of technology. An example post office box number is "9383".
thoroughfare Text An address component that represents the name or names of a passage or way through from one location to another. A thoroughfare is not necessarily a road, it might be a waterway or some other feature. For example, "Avenue des Champs-Élysées".
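
As an illustration of how the full address and the granular properties can coexist, the following Python sketch models an Address as a plain data class. The class name, the field names and the `to_record` helper are assumptions made for this example and are not identifiers defined by the vocabulary. The split of the example address into components is likewise only one plausible reading.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Address:
    """Hypothetical container mirroring the Address properties listed above."""
    full_address: Optional[str] = None
    address_id: Optional[str] = None
    thoroughfare: Optional[str] = None
    locator_designator: Optional[str] = None
    locator_name: Optional[str] = None
    address_area: Optional[str] = None
    post_name: Optional[str] = None
    post_code: Optional[str] = None
    post_office_box: Optional[str] = None
    admin_unit_l1: Optional[str] = None
    admin_unit_l2: Optional[str] = None

    def to_record(self) -> dict:
        """Return only the properties that were actually provided."""
        return {k: v for k, v in asdict(self).items() if v is not None}


# A publisher may provide just the complete address as one string ...
simple = Address(
    full_address="Champ de Mars, 5 Avenue Anatole France, 75007 Paris, France"
)

# ... or add a granular breakdown alongside it (illustrative split).
detailed = Address(
    full_address=simple.full_address,
    thoroughfare="Avenue Anatole France",
    locator_designator="5",
    post_code="75007",
    post_name="Paris",
    admin_unit_l1="FRA",  # written consistently as a code, not mixed with 'France'
)
```

Calling `detailed.to_record()` yields only the six properties that were set, so a consumer that understands nothing but the full address string can still use the record.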

Concept

Description
A SKOS Concept can be viewed as an idea or notion; a unit of thought. However, what constitutes a unit of thought is subjective, and this definition is meant to be suggestive, rather than restrictive.
Properties
No properties have been defined for this entity.

Identifier

Description
The Identifier class represents any identifier issued by any authority, whether a government agency or not. It captures the identifier itself, the type of identifier, details of the issuing authority, and the date on which the identifier was issued.
Usage
The Identifier class is based on the UN/CEFACT class of the same name and is defined under the ADMS namespace.
Properties
For this entity the following properties are defined: date of issue, identifier, identifies, issuing authority, issuing authority URI.
Property Expected Range Description Usage Codelist
date of issue DateTime The date on which the Identifier was assigned.
identifier Literal The value of this property is the Identifier itself. Example: "abc-12345-de"^^<https://belgium.be/scheme/nationalIDnumber>.
identifies Person The identifies relationship links an Identifier class to the resource it identifies.
issuing authority Literal The name of the agency responsible for issuing the Identifier. Example: "Federal Public Service Interior"@en.
issuing authority URI URI The URI of the issuing authority. Example: "https://belgium.be/id/organizations/1233".
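
The Identifier properties above can be pictured with a small Python sketch. The class, its field names and the `as_typed_literal` method are hypothetical and chosen for this example only. The values are the ones used in the table, and the rendering follows the `"value"^^<datatype>` notation of the identifier example without any escaping of special characters.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Identifier:
    """Hypothetical record mirroring the Identifier properties above."""
    value: str                                   # the identifier itself
    scheme_uri: str                              # datatype URI naming the identifier scheme
    issuing_authority: Optional[str] = None      # name of the issuing agency
    issuing_authority_uri: Optional[str] = None  # URI of the issuing agency
    date_of_issue: Optional[datetime] = None     # when the identifier was assigned

    def as_typed_literal(self) -> str:
        """Render the identifier as a typed literal (no escaping is attempted)."""
        return f'"{self.value}"^^<{self.scheme_uri}>'


national_id = Identifier(
    value="abc-12345-de",
    scheme_uri="https://belgium.be/scheme/nationalIDnumber",
    issuing_authority="Federal Public Service Interior",
    issuing_authority_uri="https://belgium.be/id/organizations/1233",
)
print(national_id.as_typed_literal())
```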

Jurisdiction

Description
The extent or range of judicial, law enforcement, or other authority.
Properties
For this entity the following properties are defined: id, name.
Property Expected Range Description Usage Codelist
id URI The value for the id property is a URI for that Jurisdiction.
name Text The name is simply a string that identifies the Jurisdiction, typically a country, with or without a language tag.

Location

Description
An identifiable geographic place or named place.
Properties
For this entity the following properties are defined: geographic identifier, geographic name.
Property Expected Range Description Usage Codelist
geographic identifier URI A URI that identifies the Location.

GeoNames.org provides stable, widely recognised identifiers for more than 10 million geographical names that can be used as links to further information. For example, http://sws.geonames.org/593116/ identifies the Lithuanian capital Vilnius. Unfortunately these URIs cannot easily be automatically deduced since the URI scheme uses simple numeric codes. Finding a GeoNames identifier for a Location is almost always a manual process. Where such identifiers are known or can be found, however, it is recommended that they be used.


Where the Location Class is used to identify a country and the GeoNames URI is not known, the recommendation is to use DBpedia URIs of the form http://dbpedia.org/resource/ISO_3166-1:XX where XX is the ISO 3166 two-character code for the country.


The EU's Publications Office diverges from ISO 3166-1 and uses EL and UK for Greece and the United Kingdom respectively. DBpedia sticks to the ISO codes and so the correct URIs for these countries are:

  • http://dbpedia.org/resource/ISO_3166-1:GR
  • http://dbpedia.org/resource/ISO_3166-1:GB
even when the geographic name is given as EL or UK.


The use of URIs has added advantages:

  • a URI can be used by automated systems to look up additional data (linked data);
  • a triple store may store only one copy of the URI, whereas if a string is used, a copy of that string is always stored for each and every person in the database. Thus, in large data sets, the saving on memory capacity and the improvement in transmission efficiency can be substantial.

geographic name Text A geographic name is a proper noun applied to a spatial object.

The INSPIRE Data Specification on Geographical Names provides a detailed model for describing a 'named place', including methods for providing multiple names in multiple scripts. This is beyond what is necessary for the Core Location Vocabulary but, importantly, the concept of a geographic name used here is consistent.


A geographic name is a proper noun applied to a spatial object. Taking the example used in the INSPIRE document (page 15), the following are all valid geographic names for the Greek capital:

  • "Αθήνα"@el-Grek (the Greek endonym written in the Greek script)
  • "Athína"@el-Latn (the standard Romanisation of the endonym)
  • "Athens"@en (the English language exonym)
INSPIRE has a detailed (XML-based) method of providing metadata about a geographic name and in XML-data sets that may be the most appropriate method to follow. When using the Core Location Vocabulary in data sets that are not focussed on environmental/geographical data (the use case for INSPIRE), the Code datatype or a simple language identifier may be used to provide such metadata.


The country codes defined in ISO 3166 may be used as geographic names and these are generally preferred over either the long form or short form of a country's name (as they are less error prone). The Publications Office of the European Union recommends the use of ISO 3166-1 codes for countries in all cases except two:

  • use 'UK' in preference to the ISO 3166 code GB for the United Kingdom;
  • use 'EL' in preference to the ISO 3166 code GR for Greece.
Where a country has changed its name or no longer exists (such as Czechoslovakia, Yugoslavia etc.) use the ISO 3166-3 code.
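
The interplay between the two country code conventions can be sketched in Python as follows. The function name, the dictionary keys and the constants are assumptions made for this illustration. The sketch uses the Publications Office code as the geographic name and the ISO 3166-1 code in the DBpedia URI, as described above, and it does not check that a code actually exists in ISO 3166-1.

```python
# Publications Office codes that differ from ISO 3166-1 (taken from the text above).
PUBLICATIONS_OFFICE_TO_ISO = {"EL": "GR", "UK": "GB"}
ISO_TO_PUBLICATIONS_OFFICE = {iso: po for po, iso in PUBLICATIONS_OFFICE_TO_ISO.items()}

DBPEDIA_PREFIX = "http://dbpedia.org/resource/ISO_3166-1:"


def country_location(code: str) -> dict:
    """Build a Location-like record for a country from a two-letter code.

    Accepts either the ISO 3166-1 code or the Publications Office variant.
    """
    code = code.strip().upper()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected a two-letter country code, got {code!r}")
    iso = PUBLICATIONS_OFFICE_TO_ISO.get(code, code)   # EL -> GR, UK -> GB
    name = ISO_TO_PUBLICATIONS_OFFICE.get(iso, iso)    # GR -> EL, GB -> UK
    return {
        "geographic_name": name,                        # Publications Office convention
        "geographic_identifier": DBPEDIA_PREFIX + iso,  # always the ISO code
    }


greece = country_location("EL")
```

With this mapping, both `country_location("GR")` and `country_location("EL")` produce the same record, which keeps the identifier stable regardless of which convention the input data follows.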

Person

Description
An individual person who may be dead or alive, but not imaginary.
Usage
The fact that a person in the context of Core Person Vocabulary cannot be imaginary makes person:Person a subclass of both foaf:Person and schema:Person which both cover imaginary characters as well as real people. The Person Class is a subclass of the more general 'Agent' class that encompasses organisations, legal entities, groups etc. - any entity that is able to carry out actions.
Subclass of
Person
Properties
For this entity the following properties are defined: alternative name, birth name, citizenship, country of birth, country of death, date of birth, date of death, family name (surname), full name, gender, given name (forename), identifier, patronymic name, place of birth, place of death, residency.
Property Expected Range Description Usage Codelist
alternative name Text Any name by which an individual is known other than their full name. Many individuals use a short form of their name, a 'middle' name as a 'first' name or a professional name. For example, the British politician and former UN High Representative for Bosnia and Herzegovina, Jeremy John Durham Ashdown, Baron Ashdown of Norton-sub-Hamdon, is usually referred to simply as 'Paddy Ashdown' or 'Lord Ashdown.' It is not the role of the alternative name property to record nick names, pet names or other 'familiar names' that will be of no consequence in public sector data exchange. Furthermore, some individuals have more than one legal name in which case the full name property should be used multiple times. Alternative name gives a means of recording names by which an individual is generally known, or professionally known, even though such names are no more than secondary from a legal point of view.
birth name Text Full name of the Person given upon their birth. All data associated with an individual are subject to change. Names can change for a variety of reasons, either formally or informally, and new information may come to light that means that a correction or clarification can be made to an existing record. Birth names tend to be persistent however and for this reason they are recorded by some public sector information systems. There is no granularity for birth name - the full name should be recorded in a single field.
citizenship Jurisdiction The citizenship relationship links a Person to a Jurisdiction that has conferred citizenship rights on the individual such as the right to vote, to receive certain protection from the community or the issuance of a passport. Citizenship is information needed by many cross-border use cases and is a legal status as opposed to the more culturally-focussed and less well-defined term "nationality". A Person has one, multiple or even no citizenship status. Multiple citizenships are recorded as multiple instances of the citizenship relationship.
country of birth Location The country in which a Person was born. The Location Class has two properties: a geographic name and a geographic identifier. Plain codes like "DE" should be provided as values for the geographic name, whereas URIs should be provided as values of the geographic identifier. Ideally, provide both. Providing a simple country name is problematic and should be avoided, whereas using a standardised code list for country names has a lot of potential for increasing semantic interoperability. Known sources of variation when country names are exchanged between communication partners without an agreed code list are: (a) long form vs. short form of a country name (e.g. "Federal Republic of Germany" vs. "Germany"), (b) different languages ("Italy" vs. "Italia"), (c) historic name vs. current name ("Burma" vs. "Myanmar"), (d) ambiguity of similar-sounding countries ("Republic of the Congo" vs. "Democratic Republic of the Congo"). The Publications Office of the European Union recommends and uses ISO 3166-1 codes for countries in all cases except two: use 'UK' in preference to the ISO 3166 code GB for the United Kingdom; use 'EL' in preference to the ISO 3166 code GR for Greece. See http://publications.europa.eu/code/en/en-5000500.htm for details of the OPOCE's full list of countries, codes, currencies and more. Where a country has changed its name or no longer exists (such as Czechoslovakia, Yugoslavia etc.) use the ISO 3166-3 code [ISO 3166-3].
country of death Location The country in which a Person died. The Location Class has two properties: a geographic name and a geographic identifier. Plain codes like "DE" should be provided as values for the geographic name, whereas URIs should be provided as values of the geographic identifier. Ideally, provide both. Providing a simple country name is problematic and should be avoided, whereas using a standardised code list for country names has a lot of potential for increasing semantic interoperability. Known sources of variation when country names are exchanged between communication partners without an agreed code list are: (a) long form vs. short form of a country name (e.g. "Federal Republic of Germany" vs. "Germany"), (b) different languages ("Italy" vs. "Italia"), (c) historic name vs. current name ("Burma" vs. "Myanmar"), (d) ambiguity of similar-sounding countries ("Republic of the Congo" vs. "Democratic Republic of the Congo"). The Publications Office of the European Union recommends and uses ISO 3166-1 codes for countries in all cases except two: use 'UK' in preference to the ISO 3166 code GB for the United Kingdom; use 'EL' in preference to the ISO 3166 code GR for Greece. See http://publications.europa.eu/code/en/en-5000500.htm for details of the OPOCE's full list of countries, codes, currencies and more. Where a country has changed its name or no longer exists (such as Czechoslovakia, Yugoslavia etc.) use the ISO 3166-3 code [ISO 3166-3].
date of birth DateTime The day on which the Person was born.
date of death DateTime The day on which the Person died.
family name (surname) Text A family name is usually shared by members of a family. This attribute also carries prefixes or suffixes which are part of the family name, e.g. "de Boer", "van de Putte", "von und zu Orlow". Multiple family names, such as are commonly found in Hispanic countries, are recorded in the single family name property so that, for example, Miguel de Cervantes Saavedra's family name would be recorded as "de Cervantes Saavedra".
full name Text The full name contains the complete name of a person as one string.

In addition to the content of given name, family name and, in some systems, patronymic name, this can carry additional parts of a person's name such as titles, middle names or suffixes like "the third" or names which are neither a given nor a family name. The full name is the most reliable label for an individual and as such its use is strongly encouraged, irrespective of whether that name is broken down using the more granular elements.


It is anticipated that some systems will only provide or process the full name of a person. Where an individual has more than one full legal name (a relatively rare but not unknown phenomenon), the full name property can be used more than once. In this case, however, the granular name elements should not be used since the intention is that these provide a breakdown of the full name and it will not be clear of which full name this is true. Note that the vocabulary provides an alternative name property. This allows name(s) to be recorded that have no legal status but that nevertheless are the names by which an individual is generally known.


A name usually stays with a person for a long time. In some European countries a name may only be changed according to certain laws and life events, e.g. marriage. The name denotes a natural person even if they change their address. Documents such as a birth certificate or diploma usually do not carry an address but always carry the name. Thus the name is one of the core attributes. However, it is not sufficient to identify a person, since very common names such as Smith in the UK, Meier in Germany, or Li in China are shared by many people.

gender Concept Gender of the Person. The gender of an individual should be recorded using a controlled vocabulary that is appropriate for the specific context. In some cases the chromosomal or physical state of an individual will be more important than the gender that they express, in others the reverse will be true. What is always important is that the controlled vocabulary used to describe an individual's gender is stated explicitly.
given name (forename) Text A given name, or multiple given names, are the denominator(s) that identify an individual within a family. These are given to a person by his or her parents at birth or may be legally recognised as 'given names' through a formal process. All given names are ordered in one property so that, for example, the given name for Johann Sebastian Bach is "Johann Sebastian".
identifier Identifier An Identifier for the Person. Examples include a national identification number, a student ID, national fiscal number, etc. We also refer to the eIDAS regulation on "electronic identification and trust services" and its mapping to the Core Person Vocabulary.
patronymic name Text Name based on the given name of the Person's father. Patronymic names are important in some countries. Iceland, for example, does not have a concept of 'family name' in the way that many other European countries do. Erik Magnusson and Erika Magnusdottir are siblings, both offspring of Magnus, irrespective of his patronymic name. In Bulgaria and Russia, patronymic names are in everyday usage, for example, the "Sergeyevich" in "Mikhail Sergeyevich Gorbachev". Note that patronymic names refer to a father's given name, not the family name inherited from the mother and father as is the case in countries such as Spain and Portugal. Again referring to the example of Miguel de Cervantes Saavedra, the patronymic name element would be unused.
place of birth Location The Location where the Person was born. The place of birth and place of death are given using the Location class, which is associated via the appropriate relationship. The Location Class has two properties: (1) the geographic name of the place, which is given as a string such as "Amsterdam" or "Valletta" and (2) an identifier, such as a GeoNames URI http://sws.geonames.org/2759794 (which identifies Amsterdam) or http://sws.geonames.org/2562305 (which identifies Valletta). The use of identifiers is preferred as these are unambiguous; however, public sector data typically uses simple names to record places and this is fully supported.
place of death Location The Location where the Person died. The place of birth and place of death are given using the Location class, which is associated via the appropriate relationship. The Location Class has two properties: (1) the geographic name of the place, which is given as a string such as "Amsterdam" or "Valletta" and (2) an identifier, such as a GeoNames URI http://sws.geonames.org/2759794 (which identifies Amsterdam) or http://sws.geonames.org/2562305 (which identifies Valletta). The use of identifiers is preferred as these are unambiguous; however, public sector data typically uses simple names to record places and this is fully supported.
residency Jurisdiction Residency typically provides an individual with a subset of the rights of a citizen. A Person has one, multiple or even no residency status. Multiple residencies are recorded as multiple instances of the residency relationship.
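
The guidance on combining full name with the granular name elements can be illustrated with a short Python sketch. The `Person` class, its field names and the `name_issues` method are hypothetical and cover only a few of the properties above. The check encodes one rule from the full name usage note: when more than one full name is recorded, the granular name elements should not be used.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    """Hypothetical record covering a few of the Person properties above."""
    full_names: List[str] = field(default_factory=list)    # full name may repeat
    given_name: Optional[str] = None
    family_name: Optional[str] = None
    patronymic_name: Optional[str] = None
    alternative_names: List[str] = field(default_factory=list)
    citizenships: List[str] = field(default_factory=list)  # one entry per Jurisdiction

    def name_issues(self) -> List[str]:
        """Return warnings about how the name properties are combined."""
        issues = []
        granular = [self.given_name, self.family_name, self.patronymic_name]
        if len(self.full_names) > 1 and any(granular):
            # It would be unclear which full name the breakdown refers to.
            issues.append(
                "more than one full name: granular name elements should not be used"
            )
        return issues


cervantes = Person(
    full_names=["Miguel de Cervantes Saavedra"],
    given_name="Miguel",
    family_name="de Cervantes Saavedra",  # multiple family names in one property
)
```

For `cervantes`, `name_issues()` returns an empty list, because a single full name is accompanied by a consistent breakdown and the patronymic name is left unused.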

Person

Description
A person that is alive, dead, real, or imaginary.
Properties
No properties are defined for this entity.

Changelog w.r.t. previous version

(non-normative)

A changelog describing the (major) changes to the previous version (1.0.0) of the Core Person Vocabulary and the new version that is being proposed in this specification (2.0.0), can be found here.

JSON-LD context

(non-normative)

A reusable JSON-LD context definition for this Core Vocabulary is retrievable at: /context/core-person.jsonld