SEMIC

Core Person Vocabulary

Status
Semic Recommendation
Published at
2022-04-01
This version
https://semiceu.github.io/Core-Person-Vocabulary/releases/2.00

Summary

The Core Person Vocabulary provides a minimum set of classes and properties for describing a natural person, i.e. the individual as opposed to any role they may play in society or the relationships they have to other people, organisations and property; all of which contribute significantly to the broader concept of identity.

Status of this document

This Core Vocabulary has the status of Semic Recommendation published on 2022-04-01.

Overview

This document describes how to use the following entities correctly within the Core Vocabulary:
| Address | Agent | Identifier | Jurisdiction | Location | Person | FOAFPerson |

This document describes how to use the following datatypes correctly within the Core Vocabulary:
| GenericDate |

Entities

Address

Definition
A spatial object that in a human-readable way identifies a fixed location.
Usage
An "address representation" as conceptually defined by the INSPIRE Address Representation data type: "Representation of an address spatial object for use in external application schemas that need to include the basic, address information in a readable way.".

The representation of Addresses varies widely from one country's postal system to another. Even within countries, there are almost always examples of Addresses that do not conform to the stated national standard. However, ISO 19160-1 provides a method through which different Addresses can be converted from one conceptual model to another.

This specification was heavily based on the INSPIRE Address Representation data type. It is noteworthy that if an Address is provided using the detailed breakdown suggested by the properties for this class, then it will be INSPIRE-conformant. To this very granular set of properties, we add two further properties:

- full address (the complete address as a formatted string)
- addressID (a unique identifier for the address).

The first of these allows publishers to simply provide the complete Address as one string, with or without formatting. This is analogous to vCard's label property.

The addressID is part of the INSPIRE guidelines and provides a hook that can be used to link the Address to an alternative representation, such as vCard or OASIS xAL.

This class belongs to Core Location Vocabulary.
Properties
For this entity the following properties are defined: address area, address ID, administrative unit level 1 (country), administrative unit level 2 (country/region/state), full address, locator designator, locator name, post code, post name (city), post office box, thoroughfare.
Property Expected Range Definition Usage Codelist
address area Text The name of a geographic area that groups Addresses. This would typically be part of a city, a neighbourhood or village, e.g. Montmartre. Address area is not an administrative unit.
address ID String A globally unique identifier for each instance of an Address. The concept of adding a globally unique identifier for each instance of an address is a crucial part of the INSPIRE data spec. A number of EU countries have already implemented an ID (a UUID) in their Address Register/gazetteer, among them Denmark. OASIS xAL also includes an address identifier. It is the address Identifier that allows an address to be represented in a format other than INSPIRE whilst remaining conformant to the Core Vocabulary.

The INSPIRE method of representing addresses is very detailed and designed primarily for use in databases of addresses. Whilst data that is published in full conformance with the INSPIRE data structure can be made available using the Core Location Vocabulary, the reverse is not true, since the Core Vocabulary allows much greater flexibility.

Many datasets that include address data as one piece of information about something else are likely to have that data in simpler formats. These might be tailored to the specific need of the dataset, follow a national norm, or make use of a standard like vCard.

To provide maximum flexibility in the Core Vocabulary, whilst remaining interoperable with INSPIRE Address Guidelines (which EU Member States are obliged to use), the Core Location Vocabulary provides the extra property of full address and makes use of INSPIRE's addressID.
administrative unit level 1 (country) Text The name of the uppermost level of the address, almost always a country. Best practice is to use the ISO 3166-1 code but if this is inappropriate for the context, country names should be provided in a consistent manner to reduce ambiguity. For example, either write 'France' or 'FRA' consistently throughout the dataset and avoid mixing the two. The Country controlled vocabulary from the Publications Office can be reused for this.
administrative unit level 2 (country/region/state) Text The name of a secondary level/region of the address, usually a county, state or other such area that typically encompasses several localities. Values could be a region or province, more granular than level 1.
full address Text The complete address written as a string. Use of this property is recommended as it will not suffer any misunderstandings that might arise through the breaking up of an address into its component parts. This property is analogous to vCard's label property but with two important differences: (1) formatting is not assumed so that, unlike vCard label, it may not be suitable to print this on an address label, (2) vCard's label property has a domain of vCard Address; the fullAddress property has no such restriction. An example of a full address is "Champ de Mars, 5 Avenue Anatole France, 75007 Paris, France".
locator designator String A number or sequence of characters that uniquely identifies the locator within the relevant scope. In simpler terms, this is the building number, apartment number, etc. For an address such as "Flat 3, 17 Bridge Street", the locator is "flat 3, 17".
locator name Text Proper noun(s) applied to the real world entity identified by the locator. The locator name could be the name of the property or complex, of the building or part of the building, or it could be the name of a room inside a building.

The key difference between a locator and a locator name is that the latter is a proper name and is unlikely to include digits. For example, "Schuman, Berlaymont" is a meeting room within the European Commission headquarters for which locator name is more appropriate than locator.
post code String The code created and maintained for postal purposes to identify a subdivision of addresses and postal delivery points. Post codes are common elements in many countries' postal address systems. One of the many post codes of Paris is for example "75000".
post name (city) Text A name created and maintained for postal purposes to identify a subdivision of addresses and postal delivery points. Usually a city, for example "Paris".
post office box String A location designator for a postal delivery point at a post office, usually a number. INSPIRE's name for this is "postalDeliveryIdentifier" for which it uses the locator designator property with a type attribute of that name. This vocabulary separates out the Post Office Box for greater independence of technology. An example post office box number is "9383".
thoroughfare Text The name of a passage or way through from one location to another. A thoroughfare is usually a street, but it might be a waterway or some other feature. For example, "Avenue des Champs-Élysées".
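
The relationship between the full address property and the granular properties can be sketched in code. The following is a minimal illustration, not part of the specification: the class name, the Python field names and the `label` helper are assumptions chosen for this example, and only a subset of the Address properties is shown. It prefers the full address when one is given and otherwise joins the available components.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Address:
    """Illustrative container; field names are assumptions, not normative."""

    full_address: Optional[str] = None
    thoroughfare: Optional[str] = None
    locator_designator: Optional[str] = None
    post_code: Optional[str] = None
    post_name: Optional[str] = None
    admin_unit_l1: Optional[str] = None
    address_id: Optional[str] = None

    def label(self) -> str:
        """Return the full address if given, else join the known parts."""
        if self.full_address:
            return self.full_address
        # Ordering "designator thoroughfare" is an assumption; national
        # conventions differ, which is why full address is recommended.
        street = " ".join(
            p for p in (self.locator_designator, self.thoroughfare) if p
        )
        city = " ".join(p for p in (self.post_code, self.post_name) if p)
        return ", ".join(p for p in (street, city, self.admin_unit_l1) if p)
```

Because the ordering of components differs between postal systems, a joined label of this kind is only a fallback; supplying the full address directly avoids the ambiguity.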

Agent

Definition
Entity that is able to carry out actions.
Usage
In compliance with the description from FOAF, an Agent is considered as any entity that is able to carry out actions. The Agent class acts as a generic element which can be further specified by implementers for their usages, for example by defining the Person class from the Core Person Vocabulary or Organization from W3C's Organization Ontology as subclasses of Agent. This Person or Organization can then issue a certain Requirement or be concerned by an Evidence provided.
Properties
No properties have been defined for this entity.

Identifier

Definition
A structured reference that identifies an entity.
Usage
The Identifier class is based on the UN/CEFACT class of the same name and is defined under the ADMS namespace.
Properties
For this entity the following properties are defined: date of issue, identifies, issuing authority name, issuing authority URI, notation, scheme name, scheme URI.
Property Expected Range Definition Usage Codelist
date of issue Date The date on which the Identifier was assigned.
identifies Person The entity that is referenced by the Identifier.
issuing authority name Text The name of the agency responsible for issuing the Identifier. Example: "Federal Public Service Interior"@en.
issuing authority URI Agent The reference in the form of a Uniform Resource Identifier to the issuing authority. Example: "https://belgium.be/id/organizations/1233".
notation Literal A string of characters to uniquely identify a concept. Example: "abc-12345-de"^^https://belgium.be/scheme/nationalIDnumber.
scheme name Text Name of the scheme used to construct the identifier. It is useful for names and descriptions that are available in multiple languages. Where this is so, each version of the data should be included and each one associated with the relevant language identifier. RFC 3066 provides a commonly used set of identifiers for natural languages. This is the set recognised by UN/CEFACT and XML Schema.
Languages are represented by two-character codes, optionally followed by a locale definition: for example, "de" means German and "de-at" means "German as spoken in Austria".
scheme URI Text URI of the scheme used to construct the identifier. It is useful for names and descriptions that are available in multiple languages. Where this is so, each version of the data should be included and each one associated with the relevant language identifier. RFC 3066 provides a commonly used set of identifiers for natural languages. This is the set recognised by UN/CEFACT and XML Schema.
Languages are represented by two-character codes, optionally followed by a locale definition: for example, "de" means German and "de-at" means "German as spoken in Austria".
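
The language-identifier convention described for multilingual text values can be illustrated with a small lookup. This is a hedged sketch only: the function name `pick_label`, the dictionary layout and the sample scheme names are assumptions made for this example. It selects the value for a requested language tag and falls back from a locale-specific tag such as "de-at" to the bare language code "de".

```python
from typing import Dict, Optional


def pick_label(labels: Dict[str, str], wanted: str) -> Optional[str]:
    """Pick the text for a language tag, falling back to the bare language."""
    wanted = wanted.lower()
    # Compare tags case-insensitively.
    normalised = {tag.lower(): text for tag, text in labels.items()}
    if wanted in normalised:
        return normalised[wanted]
    # "de-at" falls back to "de" when no locale-specific value exists.
    primary = wanted.split("-", 1)[0]
    return normalised.get(primary)
```

For instance, with hypothetical scheme names `{"de": "Personalausweisnummer", "en": "National ID number"}`, a request for "de-AT" would return the German value, while a request for "fr" would return nothing.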

Jurisdiction

Definition
The limits or territory within which authority may be exercised.
Usage
The extent or range of judicial, law enforcement, or other authority.
Properties
For this entity the following properties are defined: id, name.
Property Expected Range Definition Usage Codelist
id URI A reference in the form of a Uniform Resource Identifier to the Jurisdiction.
name Text A string of characters that represents a Jurisdiction. The name is simply a string that identifies the Jurisdiction, typically a country, with or without a language tag.

Location

Definition
Identifiable geographic place or named place.
Properties
For this entity the following properties are defined: geographic identifier, geographic name.
Property Expected Range Definition Usage Codelist
geographic identifier URI A reference in the form of a Uniform Resource Identifier to the Location. GeoNames.org provides stable, widely recognised identifiers for more than 10 million geographical names that can be used as links to further information. For example, http://sws.geonames.org/593116/ identifies the Lithuanian capital Vilnius. Unfortunately, these URIs cannot easily be deduced automatically since the URI scheme uses simple numeric codes. Finding a GeoNames identifier for a Location is almost always a manual process. Where such identifiers are known or can be found, however, it is recommended that they be used. Where the Location Class is used to identify a country, if the GeoNames URI is not known, the recommendation is to use DBpedia URIs of the form http://dbpedia.org/resource/ISO_3166-1:XX where XX is the ISO 3166 two-character code for the country. The Publications Office of the European Union diverges from ISO 3166-1 and uses EL and UK for Greece and the United Kingdom respectively. DBpedia sticks to the ISO codes and so the correct URIs for these countries are:
- http://dbpedia.org/resource/ISO_3166-1:GR
- http://dbpedia.org/resource/ISO_3166-1:GB
even when the geographic name is given as EL or UK.
geographic name Text A textual description for a Location. The INSPIRE Data Specification on Geographical Names provides a detailed model for describing a 'named place', including methods for providing multiple names in multiple scripts. INSPIRE's definition is the following: Names of areas, regions, localities, cities, suburbs, towns or settlements, or any geographical or topographical feature of public or historical interest. This is beyond what is necessary for the Core Location Vocabulary but, importantly, the concept of a geographic name used here is consistent.

A geographic name is a proper noun applied to a spatial object. Taking the example used in the INSPIRE document (page 18), the following are all valid geographic names for the Greek capital:
- "Αθήνα"@gr-Grek (the Greek endonym written in the Greek script)
- "Athína"@gr-Latn (the standard Romanisation of the endonym)
- "Athens"@en (the English language exonym)

INSPIRE has a detailed (XML-based) method of providing metadata about a geographic name and in XML-data sets that may be the most appropriate method to follow. When using the Core Location Vocabulary in data sets that are not focussed on environmental/geographical data (the use case for INSPIRE), the Code datatype or a simple language identifier may be used to provide such metadata.

The country codes defined in ISO 3166 may be used as geographic names and these are generally preferred over either the long form or short form of a country's name (as they are less error prone). The Publications Office of the European Union recommends the use of ISO 3166-1 codes for countries in all cases except two:
- use 'UK' in preference to the ISO 3166 code GB for the United Kingdom;
- use 'EL' in preference to the ISO 3166 code GR for Greece.

Where a country has changed its name or no longer exists (such as Czechoslovakia, Yugoslavia etc.) use the ISO 3166-3 code.
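
The two country-code conventions described for Location (Publications Office codes for the geographic name, ISO-based DBpedia URIs for the geographic identifier) can be reconciled with a small mapping. The following is a minimal sketch; the function and constant names are assumptions made for this example, and only the two exceptions named in the text (EL/GR and UK/GB) are handled. No validation against the actual ISO 3166-1 code list is attempted.

```python
# Publications Office codes that differ from ISO 3166-1 two-character codes.
EU_TO_ISO = {"EL": "GR", "UK": "GB"}
ISO_TO_EU = {iso: eu for eu, iso in EU_TO_ISO.items()}

DBPEDIA_PREFIX = "http://dbpedia.org/resource/ISO_3166-1:"


def dbpedia_country_uri(code: str) -> str:
    """Build the DBpedia URI, translating EL/UK to the ISO codes GR/GB."""
    upper = code.upper()
    return DBPEDIA_PREFIX + EU_TO_ISO.get(upper, upper)


def publications_office_code(iso_code: str) -> str:
    """Return the code to use as geographic name (UK and EL exceptions)."""
    upper = iso_code.upper()
    return ISO_TO_EU.get(upper, upper)
```

With this sketch, a Location whose geographic name is 'EL' would receive the identifier ending in `ISO_3166-1:GR`, which matches the guidance that the DBpedia URI follows ISO even when the name follows the Publications Office convention.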

Person

Definition
An individual human being who may be dead or alive, but not imaginary.
Usage
The fact that a person in the context of Core Person Vocabulary cannot be imaginary makes person:Person a subclass of both foaf:Person and schema:Person which both cover imaginary characters as well as real people. The Person Class is a subclass of the more general 'Agent' class.
Subclass of
FOAFPerson
Properties
For this entity the following properties are defined: alternative name, birth name, citizenship, country of birth, country of death, date of birth, date of death, domicile, family name (surname), full name, gender, given name (forename), identifier, matronymic name, patronymic name, place of birth, place of death, residency.
Property Expected Range Definition Usage Codelist
alternative name Text Any name by which a Person is known, other than their full name. Many individuals use a short form of their name, a 'middle' name as a 'first' name or a professional name. For example, the British politician and former UN High Representative for Bosnia and Herzegovina, Jeremy John Durham Ashdown, Baron Ashdown of Norton-sub-Hamdon, is usually referred to simply as 'Paddy Ashdown' or 'Lord Ashdown'. It is not the role of the alternative name property to record nicknames, pet names or other 'familiar names' that will be of no consequence in public sector data exchange. Furthermore, some individuals have more than one legal name, in which case the full name property should be used multiple times. Alternative name gives a means of recording names by which an individual is generally known, or professionally known, even though such names are no more than secondary from a legal point of view.
birth name Text Full name of the Person given upon their birth. The birth name may apply to the surname, the given name, or the entire name. Where births are required to be officially registered, the entire name entered onto a births register or birth certificate may by that fact alone become the person's legal name. https://en.wikipedia.org/wiki/Birth_name.

All data associated with an individual are subject to change. Names can change for a variety of reasons, either formally or informally, and new information may come to light that means that a correction or clarification can be made to an existing record. Birth names tend to be persistent however and for this reason they are recorded by some public sector information systems. There is no granularity for birth name - the full name should be recorded in a single field.
citizenship Jurisdiction The Jurisdiction that has conferred citizenship rights on the Person such as the right to vote, to receive certain protection from the community or the issuance of a passport. Citizenship is a relationship between an individual and a state to which the individual owes allegiance and in turn is entitled to its protection.

Citizenship is information needed by many cross-border use cases and is a legal status as opposed to the more culturally-focussed and less well-defined term "nationality". A Person has one, multiple or even no citizenship status. Multiple citizenships are recorded as multiple instances of the citizenship relationship.
country of birth Location The country in which the Person was born. The Location Class has two properties: a geographic name and a geographic identifier. Plain codes like "DE" should be provided as values of the geographic name, whereas URIs should be provided as values of the geographic identifier. Ideally, provide both. Providing a simple country name is problematic and should be avoided, whereas a standardised system based on a code list for country names has a lot of potential for increasing semantic interoperability. Without an agreed code list, communication partners exchanging country names have to deal with several known kinds of variation: (a) long form vs. short form of a country name (e.g. Federal Republic of Germany vs. Germany), (b) different languages (Italy vs. Italia), (c) historic name vs. current name (Burma vs. Myanmar), (d) ambiguity between similarly named countries (Republic of the Congo vs. Democratic Republic of the Congo). The Publications Office of the European Union recommends and uses ISO 3166-1 codes for countries in all cases except two: use 'UK' in preference to the ISO 3166 code GB for the United Kingdom; use 'EL' in preference to the ISO 3166 code GR for Greece. See Publications Office list of countries for details of the OPOCE's full list of countries, codes, currencies and more. Where a country has changed its name or no longer exists (such as Czechoslovakia, Yugoslavia etc.) use the ISO 3166-3 code.
country of death Location The country in which a Person died. The Location Class has two properties: a geographic name and a geographic identifier. Plain codes like "DE" should be provided as values of the geographic name, whereas URIs should be provided as values of the geographic identifier. Ideally, provide both. Providing a simple country name is problematic and should be avoided, whereas a standardised system based on a code list for country names has a lot of potential for increasing semantic interoperability. Without an agreed code list, communication partners exchanging country names have to deal with several known kinds of variation: (a) long form vs. short form of a country name (e.g. Federal Republic of Germany vs. Germany), (b) different languages (Italy vs. Italia), (c) historic name vs. current name (Burma vs. Myanmar), (d) ambiguity between similarly named countries (Republic of the Congo vs. Democratic Republic of the Congo). The Publications Office of the European Union recommends and uses ISO 3166-1 codes for countries in all cases except two: use 'UK' in preference to the ISO 3166 code GB for the United Kingdom; use 'EL' in preference to the ISO 3166 code GR for Greece. See Publications Office list of countries for details of the OPOCE's full list of countries, codes, currencies and more. Where a country has changed its name or no longer exists (such as Czechoslovakia, Yugoslavia etc.) use the ISO 3166-3 code.
date of birth GenericDate The point in time on which the Person was born. The date of birth can be expressed as date, gYearMonth or gYear, for example:
- 1980-09-16^^xs:date
- 1980-09^^xs:gYearMonth
- 1980^^xs:gYear
date of death GenericDate The point in time on which the Person died. The date of death can be expressed as date, gYearMonth or gYear, for example:
- 1980-09-16^^xs:date
- 1980-09^^xs:gYearMonth
- 1980^^xs:gYear
domicile Address The place that the Person treats as permanent home.
family name (surname) Text The hereditary surname of a family. Usually referring to a group of people related by blood, marriage or adoption. This attribute also carries prefixes or suffixes which are part of the family name, e.g. "de Boer", "van de Putte", "von und zu Orlow". Multiple family names, such as are commonly found in Hispanic countries, are recorded in the single family name property so that, for example, Miguel de Cervantes Saavedra's family name would be recorded as "de Cervantes Saavedra".
full name Text The complete name of the Person as one string. It can be equal to or different from a Person's birth name. The birth name is used as a legal term, whereas the full name just gives a representation of the complete name of a Person.

In addition to the content of given name, family name and, in some systems, patronymic name, this can carry additional parts of a person's name such as titles, middle names or suffixes like "the third" or names which are neither a given nor a family name. The full name is the most reliable label for an individual and as such its use is strongly encouraged, irrespective of whether that name is broken down using the more granular elements.

It is anticipated that some systems will only provide or process the full name of a person. Where an individual has more than one full legal name (a relatively rare but not unknown phenomenon), the full name property can be used more than once. In this case, however, the granular name elements should not be used since the intention is that these provide a breakdown of the full name and it will not be clear of which full name this is true. Note that the vocabulary provides an alternative name property. This allows name(s) to be recorded that have no legal status but that nevertheless are the names by which an individual is generally known.

A name usually stays with a person for a long time. In some European countries a name may only be changed according to certain laws and life events, e.g. marriage. The name denotes a natural person even if they change their address. Documents like a birth certificate or diploma usually carry no address but always carry the name. Thus the name is one of the core attributes. However, it is not sufficient to identify a person, since some names are very common, like Smith in the UK, Meier in Germany, or Li in China.
gender Code The identities, expressions and societal roles of the Person. The gender of an individual should be recorded using a controlled vocabulary that is appropriate for the specific context. In some cases, the chromosomal or physical state of an individual will be more important than the gender that they express; in others the reverse will be true. What is always important is that the controlled vocabulary used to describe an individual's gender is stated explicitly.
given name (forename) Text The name(s) that identify the Person within a family with a common surname. Usually a first name. Given to a person by his or her parents at birth or legally recognised as 'given names' through a formal process. All given names are ordered in one property so that, for example, the given name for Johann Sebastian Bach is "Johann Sebastian".
identifier Identifier The unambiguous structured reference to the Person. Examples include a national identification number, a student ID, national fiscal number, etc. We also refer to the eIDAS regulation on "electronic identification and trust services" and its mapping to the Core Person Vocabulary.
matronymic name Text Name based on the given name of the Person's mother. It is useful for names and descriptions that are available in multiple languages. Where this is so, each version of the data should be included and each one associated with the relevant language identifier. RFC 3066 provides a commonly used set of identifiers for natural languages. This is the set recognised by UN/CEFACT and XML Schema.
Languages are represented by two-character codes, optionally followed by a locale definition: for example, "de" means German and "de-at" means "German as spoken in Austria".
patronymic name Text Name based on the given name of the Person's father. Patronymic names are important in some countries. Iceland does not have a concept of 'family name' in the way that many other European countries do, for example. Erik Magnusson and Erika Magnusdottir are siblings, both offspring of Magnus, irrespective of his patronymic name. In Bulgaria and Russia, patronymic names are in everyday usage, for example, the "Sergeyevich" in "Mikhail Sergeyevich Gorbachev". Note that patronymic names refer to a father's given name, not the family name inherited from the mother and father as is the case in countries such as Spain and Portugal. Again referring to the example of Miguel de Cervantes Saavedra, the patronymic name element would be unused.
place of birth Location The Location where the Person was born. The Place of Birth and Place of Death are given using the Location class which is associated via the appropriate relationship. The Location Class has two properties: (1) the geographic name of the place, which is given as a string such as "Amsterdam" or "Valetta" and (2) an identifier, such as a geonames URI http://sws.geonames.org/2759794 (which identifies Amsterdam) or http://sws.geonames.org/2562305 (which identifies Valetta). The use of identifiers is preferred as these are unambiguous, however, public sector data typically uses simple names to record places and this is fully supported.
place of death Location The Location where the Person died. The Place of Birth and Place of Death are given using the Location class which is associated via the appropriate relationship. The Location Class has two properties: (1) the geographic name of the place, which is given as a string such as "Amsterdam" or "Valetta" and (2) an identifier, such as a geonames URI http://sws.geonames.org/2759794 (which identifies Amsterdam) or http://sws.geonames.org/2562305 (which identifies Valetta). The use of identifiers is preferred as these are unambiguous, however, public sector data typically uses simple names to record places and this is fully supported.
residency Jurisdiction Jurisdiction where the Person has their dwelling. A Person's fixed, permanent, and principal home for legal purposes, the place (especially the house) in which a person officially lives or resides. A Person has one, multiple or even no residency status. Multiple residencies are recorded as multiple instances of the residency relationship.
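
The guidance on combining full name with the granular name properties can be expressed as a simple consistency check. This is a sketch under stated assumptions, not a normative validation rule: the class `PersonNames`, its field names and the function `name_warnings` are hypothetical, and the warning texts paraphrase the usage notes for full name.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PersonNames:
    """Illustrative subset of the Person name properties."""

    full_names: List[str] = field(default_factory=list)
    given_name: Optional[str] = None
    family_name: Optional[str] = None
    patronymic_name: Optional[str] = None
    alternative_names: List[str] = field(default_factory=list)


def name_warnings(person: PersonNames) -> List[str]:
    """Return advisory messages derived from the full name usage notes."""
    warnings: List[str] = []
    granular = (person.given_name, person.family_name, person.patronymic_name)
    # With several full names it is unclear which one the parts break down.
    if len(person.full_names) > 1 and any(granular):
        warnings.append("granular name elements used with multiple full names")
    # Use of full name is strongly encouraged.
    if not person.full_names:
        warnings.append("full name is missing")
    return warnings
```

For a record such as full name "Miguel de Cervantes Saavedra", given name "Miguel" and family name "de Cervantes Saavedra", the check returns no warnings; adding a second full name to the same record would trigger the first warning.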

FOAFPerson

Definition
A person that is alive, dead, real, or imaginary.
Subclass of
Agent
Properties
No properties have been defined for this entity.

Datatypes

GenericDate

Definition
The GenericDate data type is the union of xs:date, xs:gYearMonth and xs:gYear.
Properties
There are no properties defined for this datatype.
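
Because GenericDate is a union of three XML Schema types, a consumer typically needs to work out which member type a given value belongs to. The sketch below shows one possible way to do this; the function name `generic_date_type` is an assumption, and the patterns are deliberately simplified (four-digit years only, no time zone suffix and no negative years, all of which the XML Schema lexical forms permit).

```python
import re
from datetime import date

# Simplified lexical patterns, tried in this order.
_PATTERNS = (
    ("xs:date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("xs:gYearMonth", re.compile(r"\d{4}-\d{2}")),
    ("xs:gYear", re.compile(r"\d{4}")),
)


def generic_date_type(value: str) -> str:
    """Return the union member type of a GenericDate value, or raise."""
    for name, pattern in _PATTERNS:
        if pattern.fullmatch(value):
            parts = [int(p) for p in value.split("-")]
            # Pad missing month/day with 1 so the calendar check can run.
            year, month, day = (parts + [1, 1])[:3]
            date(year, month, day)  # raises ValueError for e.g. month 13
            return name
    raise ValueError(f"not a GenericDate value: {value!r}")
```

Applied to the examples given for date of birth, "1980-09-16" is classified as xs:date, "1980-09" as xs:gYearMonth and "1980" as xs:gYear, while a value such as "1980-13" is rejected.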

Changelog w.r.t. previous version

(non-normative)

A changelog describing the (major) changes between the previous version (1.0.0) of the Core Person Vocabulary and the new version proposed in this specification (2.0.0) can be found here.

UML representation

(non-normative)

The UML representation from which this Core Vocabulary has been built is available here.

RDF representation

(non-normative)

A reusable RDF representation (in turtle) for this Core Vocabulary is retrievable here.
This RDF file contains only the terminology for which the URI is minted in the Core Vocabularies domain http://data.europa.eu/m8g. Terms that are mapped on an existing URI (hence reused from other vocabularies) are not included.

JSON-LD context

(non-normative)

A reusable JSON-LD context definition for this Core Vocabulary is retrievable here.

SHACL template

(non-normative)

A reusable SHACL template for this Core Vocabulary is retrievable here.