Abstract

GeoDCAT-AP is an extension of the DCAT application profile for data portals in Europe (DCAT-AP) for describing geospatial datasets, dataset series, and services. Its basic use case is to make spatial datasets, dataset series, and services searchable on general data portals, thereby making geospatial information easier to find across borders and sectors. For this purpose, GeoDCAT-AP provides an RDF vocabulary and the corresponding RDF syntax binding for the union of metadata elements of the core profile of ISO 19115:2003 and those defined in the framework of the INSPIRE Directive of the European Union.

This document is the result of the major change release process described in the Change and Release Management Policy for DCAT-AP and was built starting from GeoDCAT-AP version 1.0.1.

Comments and queries should be sent via the issue tracker of the dedicated GitHub repository.

Disclaimer

The views expressed in this document are purely those of the Author(s) and may not, in any circumstances, be interpreted as stating an official position of the European Commission.

The European Commission does not guarantee the accuracy of the information included in this study, nor does it accept any responsibility for any use thereof.

Reference herein to any specific products, specifications, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favouring by the European Commission.

All care has been taken by the author to ensure that s/he has obtained, where necessary, permission to use any parts of manuscripts including illustrations, maps, and graphs, on which intellectual property rights already exist from the titular holder(s) of such rights or from her/his or their legal representative.

Document History

Introduction

This document contains version 2.0.0 of the specification for GeoDCAT-AP, an extension of the DCAT application profile for data portals in Europe (DCAT-AP) [[DCAT-AP]] for describing geospatial datasets, dataset series, and services.

Its basic use case is to make spatial datasets, dataset series, and services searchable on general data portals, thereby making geospatial information easier to find across borders and sectors. For this purpose, GeoDCAT-AP provides an RDF vocabulary and the corresponding RDF syntax binding for the union of metadata elements of the core profile of ISO 19115:2003 [[ISO-19115]] and those defined in the framework of the INSPIRE Directive [[INSPIRE-DIR]].

The GeoDCAT-AP specification does not replace the INSPIRE Metadata Regulation [[?INSPIRE-MD-REG]] nor the INSPIRE Metadata technical guidelines [[INSPIRE-MD]] based on ISO 19115 and ISO 19119. Its purpose is to give owners of geospatial metadata the possibility to achieve more by providing the means of an additional implementation through harmonised RDF syntax bindings. Conversion rules to RDF syntax would allow Member States to maintain their collections of INSPIRE-relevant datasets following the INSPIRE Metadata technical guidelines based on [[ISO-19115]] and [[ISO-19119]], while at the same time publishing these collections on [[DCAT-AP]]-conformant data portals. A conversion to an RDF representation allows additional metadata elements to be displayed on general-purpose data portals, provided that such data portals are capable of displaying additional metadata elements. Additionally, data portals may be capable of providing machine-to-machine interfaces where additional metadata could be provided.

Context

With a view to fostering Europe's digital transformation, the European Commission is investing in frameworks and agreements to provide governments and businesses with the appropriate resources to digitalise their services. There are, for instance, building blocks available for creating a single market for data, where data flows freely within the EU and across sectors for the benefit of businesses, researchers and public administrations.

To this end, the European Commission launched the European Data Strategy [[DataStrategy]] as part of its priorities for 2019-2024, in order to make the EU a leader in a data-driven society.

Studies previously conducted on behalf of the European Commission (e.g., [[Vickery]]) shed light on the importance of making data findable and available in machine-readable formats in order to stimulate its reuse. This vision led to several legislative actions to reduce barriers and promote data sharing, such as the Open Data Directive [[OPENDATA-DIR]], which turned out to be a driving force towards more transparent and fair access to government data.

In this regard, the wide range of (open) data portals developed by European public administrations is the result of this mission. These Web-based interfaces give access to data by providing users with the means to explore a catalogue of datasets. To facilitate the sharing, discovery and re-use of the data beyond the (open) data portal, common agreements for the exchange of catalogues of datasets are needed. These interoperability agreements make it possible to connect (open) data catalogues into a pan-European catalogue of datasets.

The DCAT-AP application profile specifies the generic agreements for (open) data portals operated by public administrations in Europe. This specification extends DCAT-AP with additional requirements for geospatial data.

The current document is the result of the major semantic change release process described in the Change and Release Management Policy for DCAT-AP [[DCAT-AP-CRMP]] and was built starting from GeoDCAT-AP version 1.0.1 [[GeoDCAT-AP-20160802]], with the purpose of aligning it with DCAT-AP version 2.0.1 [[DCAT-AP-20200608]] and with version 2.0.1 of the INSPIRE Metadata Technical Guidelines [[INSPIRE-MD-20170302]].

This work has been carried out in the context of Action 1.1 – Improving semantic interoperability in European eGovernment systems [[SEMIC]] of the European Commission’s Interoperability solutions for public administrations, businesses and citizens (ISA²) Programme [[ISA2]].

Scope of the revision

The objective of this work is to produce an updated release of GeoDCAT-AP based on requests for change coming from real-world implementations of the specification and an alignment with [[DCAT-AP-20200608]] and [[INSPIRE-MD-20170302]].

As with [[DCAT-AP-20200608]], the Application Profile specified in this document is based on the specification of the Data Catalog Vocabulary (DCAT), originally developed under the responsibility of the Government Linked Data Working Group [[GLD]] at W3C, and significantly revised in 2020 by the W3C Dataset Exchange Working Group [[DXWG]]. DCAT is an RDF [[RDF11-CONCEPTS]] vocabulary designed to facilitate interoperability between data catalogues published on the Web. Additional classes and properties from other well-known vocabularies are re-used where necessary.

The work does not cover implementation issues like mechanisms for exchange of data and expected behaviour of systems implementing the Application Profile other than what is defined in the Conformance Statement in .

The Application Profile is intended to facilitate data exchange and therefore the classes and properties defined in this document are only relevant for the data to be exchanged; there are no requirements for communicating systems to implement specific technical environments. The only requirement is that the systems can export and import data in RDF in conformance with this Application Profile.

A DCAT-AP extension

The GeoDCAT-AP specification is designed as an extension of DCAT-AP in conformance with the guidelines for the creation of DCAT-AP extensions [[DCAT-AP-EG]]. The DCAT-AP Application Profile on which this document is based is the DCAT-AP specification of 8 June 2020 [[DCAT-AP-20200608]]. According to the same principles DCAT-AP is in itself an extension of DCAT.

This dependency must be taken into account while reading and using this specification. When implementing GeoDCAT-AP, these specifications should be consulted first to fill any gaps in this document.

Terminology used in the Application Profile

A Vocabulary is a specification that determines the semantics of terms (classes and properties) in a broad context of information exchange. The defined terms are highly reusable.

A Controlled Vocabulary is an organised arrangement of words and phrases used to index content and/or to retrieve content through browsing or searching. It typically includes preferred and variant terms and has a defined scope or describes a specific domain [[?GETTY]]. Within this specification, it is also used as a synonym for a codelist.

An Application Profile is a specification that re-uses terms from one or more base standards (vocabularies), adding more specificity by identifying mandatory, recommended and optional elements to be used for a particular application, as well as recommendations for controlled vocabularies to be used.

In the following sections, classes and properties are grouped under headings ‘mandatory’, ‘recommended’ and ‘optional’. These terms have the following meaning:

The meaning of the terms MUST, MUST NOT, SHOULD and MAY in this section and in the following sections is as defined in [[RFC2119]].

In the given context, the term "processing" means that receivers must accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).

Classes are classified as ‘Mandatory’ in if they appear as the range of one of the mandatory properties in .

The class ‘Distribution’ is classified as ‘Recommended’ in to allow for cases where a particular Dataset does not have a downloadable Distribution, and the sender of data would therefore not be able to provide this information. However, it can be expected that in the majority of cases Datasets do have downloadable Distributions, and in such cases the provision of information on the Distribution is mandatory.

All other classes are classified as ‘Optional’ in . A further description of the optional classes is only included as a sub-section in if the Application Profile specifies mandatory or recommended properties for them.

Classes no longer included in this version of the specification, or replaced with other ones, are classified as ‘Deprecated’ in .

Finally, classes and properties added in this Application Profile and extending [[DCAT-AP]] are marked with a prepended “plus” sign (+), and summarised in the reference table in .

Namespaces

The namespace for GeoDCAT-AP is: http://data.europa.eu/930/

The suggested namespace prefix is: geodcat
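As a minimal sketch, the namespace and suggested prefix above can be declared in Turtle as follows. The additional prefixes are the commonly used ones for some of the vocabularies cited in this document; they are an illustrative subset, not the full list of namespaces used by the Application Profile.

```turtle
# GeoDCAT-AP namespace and suggested prefix, as stated above.
@prefix geodcat: <http://data.europa.eu/930/> .

# Illustrative subset of reused vocabularies (not the full list).
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dct:     <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
```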

The Application Profile reuses terms from various existing specifications, following established best practices [[?DWBP]]. The following table indicates the full list of corresponding namespaces used in this document.

Application Profile Overview

This Application Profile extends [[DCAT-AP-20200608]] by including

These extensions are meant to provide a DCAT-AP-conformant representation of geospatial metadata, identified by defining RDF syntax mappings covering the union of the elements in the INSPIRE metadata schema [[INSPIRE-MD-REG]] [[INSPIRE-MD]] and the core profile of [[ISO-19115]], following the criteria illustrated in .

The current specification does not change the set of geospatial metadata elements already covered in its previous version [[GeoDCAT-AP-20160802]], but rather it updates some of the corresponding mappings based on the new classes and properties included in [[DCAT-AP-20200608]], following the release of version 2 of the W3C Data Catalog (DCAT) Vocabulary [[VOCAB-DCAT-2]], and aligns them with version 2.0.1 of the INSPIRE Metadata Technical Guidelines [[INSPIRE-MD-20170302]]. With the same objective, new terms have been defined in the GeoDCAT-AP namespace for the specification of agent roles (see ) and spatial resolution (see ). As a result, some of the classes and properties defined in [[GeoDCAT-AP-20160802]] have been deprecated and replaced by the equivalent ones in [[DCAT-AP-20200608]] and by the newly defined terms.

The list of deprecated classes and properties is summarised in .

shows a UML diagram of the classes and properties included in the Application Profile.

GeoDCAT-AP UML Class Diagram

Application Profile Classes

Mandatory Classes

Optional Classes

Deprecated Classes

Application Profile Properties per Class

A quick reference table of properties per class is included in . The list of included properties contains all the properties in [[DCAT-AP-20200608]], plus a selection of properties from [[VOCAB-DCAT-2]] and [[DCTERMS]] on which GeoDCAT-AP expresses additional constraints or whose usage it wants to emphasise.

Examples of the use of these properties, encoded in [[Turtle]], are included in the relevant sections. The examples are also available in [[Turtle]], RDF/XML [[RDF-SYNTAX-GRAMMAR]], and JSON-LD [[JSON-LD11]] from a separate folder.

+Activity

The use of this class in GeoDCAT-AP is illustrated in .

Mandatory properties for Activity

Examples for Activity

shows the use of class prov:Activity to specify the results of a testing activity assessing the conformance of a given resource against a given specification - as illustrated in .

+Address (Agent)

Examples for Address (Agent)

+Address (Kind)

Examples for Address (Kind)

Agent

Mandatory property for Agent

Optional properties for Agent

Examples for Agent

+Attribution

The use of this class in GeoDCAT-AP is illustrated in .

Mandatory properties for Attribution

Deprecated properties for Attribution

Examples for Attribution

Catalogue

Mandatory properties for Catalogue

Optional properties for Catalogue

Deprecated properties for Catalogue

Catalogue Record

Mandatory properties for Catalogue Record

Optional properties for Catalogue Record

Examples for Catalogue Record

Category

Mandatory property for Category

Optional property for Category

Examples for Category

Category Scheme

Mandatory property for Category Scheme

Optional properties for Category Scheme

Examples for Category Scheme

Checksum

Mandatory properties for Checksum

Data Service

Mandatory properties for Data Service

Optional properties for Data Service

Deprecated properties for Data Service

Examples for Data Service

Dataset

Mandatory properties for Dataset

Optional properties for Dataset

Deprecated properties for Dataset

Examples for Dataset

Distribution

Mandatory properties for Distribution

Optional properties for Distribution

Deprecated properties for Distribution

Examples for Distribution

Document

Examples for Document

Identifier

Mandatory property for Identifier

Kind

Optional properties for Kind

Examples for Kind

Licence Document

Optional property for Licence Document

Examples for Licence Document

Location

Optional properties for Location

Examples for Location

Media Type

Examples for Media Type

+Metric

Instances of Metric

An abridged version of the definition of these instances is shown in .

Examples for Metric

Period of Time

Optional properties for Period of Time

Deprecated properties for Period of Time

Examples for Period of Time

is equivalent to , but it uses the [[OWL-TIME]] properties in .

Provenance Statement

Examples for Provenance Statement

+Quality Measurement

Examples for Quality Measurement

Relationship

Mandatory properties for Relationship

Rights Statement

Standard

Optional properties for Standard

Examples for Standard

The following examples show the specification of different types of Standards.

Controlled Vocabularies

Requirements for controlled vocabularies

The following is a list of requirements that were identified for the controlled vocabularies to be recommended in this Application Profile.

Controlled vocabularies SHOULD:

These criteria do not intend to define a set of requirements for controlled vocabularies in general; they are only intended to be used for the selection of the controlled vocabularies that are proposed for this Application Profile.

Controlled vocabularies to be used

In the table below, a number of properties are listed with controlled vocabularies that MUST be used for the listed properties. The declaration of the following controlled vocabularies as mandatory ensures a minimum level of interoperability.

Compared with [[DCAT-AP-20200608]], GeoDCAT-AP makes use of additional controlled vocabularies mandated by [[INSPIRE-MD-REG]] and operated by the INSPIRE Registry, with the sole exception of the coordinate reference systems register maintained by OGC [[OGC-EPSG]].

For two of these controlled vocabularies, namely the INSPIRE spatial data themes [[INSPIRE-THEMES]] and the ISO topic categories [[INSPIRE-TC]], the GeoDCAT-AP Working Group has defined a set of harmonised mappings to the EU Vocabularies Data Themes [[EUV-THEMES]], in order to facilitate the identification of the relevant theme in [[EUV-THEMES]] for geospatial metadata. The status of this work, along with links to a machine readable representation of the mappings, is documented on a dedicated page in Joinup [[?GeoDCAT-ACV]].

Other controlled vocabularies

In addition to the proposed common vocabularies in , which are mandatory to ensure minimal interoperability, implementers are encouraged to publish and to use further region or domain-specific vocabularies that are available online. While those may not be recognised by general implementations of the Application Profile, they may serve to increase interoperability across applications in the same region or domain. Examples are the full set of concepts in EuroVoc [[?EUV-EUROVOC]], the CERIF standard vocabularies [[?CERIF-VOCS]], the Dewey Decimal Classification [[?DDC]] and numerous other schemes.

For geospatial metadata, the working group has identified the following additional vocabularies:

Licence vocabularies

Concerning licence vocabularies, implementers are encouraged to use widely recognised licences such as Creative Commons licences [[?CC]], and in particular the CC Zero Public Domain Dedication [[?CC0]] or CC-BY Attribution 4.0 International [[?CC-BY]], or the Open Data Commons Public Domain Dedication and License (PDDL) [[?PDDL]]. Often there is applicable legislation or a licensing policy in place that determines the set of licences to be used. Such legislation or policies may recommend the use of an open government licence such as the UK Open Government Licence [[?UKOGL]].

Further activities in this area are undertaken by the Open Data Institute [[?ODI]] with the Open Data Rights Statement Vocabulary [[?ODRS]] and by the Open Digital Rights Language (ODRL) Initiative [[VOCAB-ODRL]].
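A licence such as CC-BY 4.0 could be attached to a Distribution along the following lines. This is only a sketch: the `ex:` namespace, the resource names and the access URL are hypothetical, and the use of dct:license on the Distribution follows common DCAT practice rather than anything stated in this section.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .          # hypothetical namespace

# Hypothetical Distribution released under CC-BY 4.0.
ex:roads-gml a dcat:Distribution ;
  dcat:accessURL <http://example.org/data/roads.gml> ;
  dct:license <https://creativecommons.org/licenses/by/4.0/> .

# The licence itself, typed as a Licence Document.
<https://creativecommons.org/licenses/by/4.0/> a dct:LicenseDocument .
```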

Provider requirements

In order to conform to this Application Profile, an application that provides metadata MUST:

For the properties listed in the table in , the associated controlled vocabularies MUST be used. Additional controlled vocabularies MAY be used.

In addition to the mandatory properties, any of the recommended and optional properties defined in MAY be provided.

Recommended and optional classes may have mandatory properties, but those only apply if and when an instance of such a class is present in a description.

Receiver requirements

In order to conform to this Application Profile, an application that receives metadata MUST be able to:

As stated in , "processing" means that receivers must accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).

Agent Roles

GeoDCAT-AP supports the 11 agent roles defined in [[INSPIRE-MD-REG]] and [[ISO-19115]] by using two different but not mutually exclusive approaches.

In the first approach, roles are mapped to specific properties. In particular, GeoDCAT-AP makes use of the relevant properties from [[DCAT-AP-20200608]] and [[DCTERMS]], which overall cover 4 of the relevant roles - namely, dcat:contactPoint, dct:creator, dct:publisher, and dct:rightsHolder (not supported in [[DCAT-AP-20200608]]). The remaining 7 are covered by the corresponding properties defined in the GeoDCAT-AP namespace - namely, geodcat:custodian, geodcat:distributor, geodcat:originator, geodcat:principalInvestigator, geodcat:processor, geodcat:resourceProvider, and geodcat:user.

It is important to note that, following [[INSPIRE-MD-REG]] and [[ISO-19115]], in GeoDCAT-AP all these agent roles can be specified on Catalogues, Catalogue Records, Datasets, and Data Services.
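The property-based approach could look roughly as follows for a Dataset. This is a sketch only: the `ex:` namespace, the resources and the names are hypothetical, and only a few of the role properties listed above are shown.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix dct:     <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix geodcat: <http://data.europa.eu/930/> .
@prefix ex:      <http://example.org/> .       # hypothetical namespace

ex:roads a dcat:Dataset ;
  dct:title "Road network"@en ;
  # Roles covered by DCAT-AP / DCTERMS properties.
  dct:publisher ex:mapping-agency ;
  dct:rightsHolder ex:mapping-agency ;
  # Roles covered by properties in the GeoDCAT-AP namespace.
  geodcat:custodian ex:mapping-agency ;
  geodcat:distributor ex:portal-operator .

ex:mapping-agency a foaf:Agent ;
  foaf:name "Example Mapping Agency"@en .

ex:portal-operator a foaf:Agent ;
  foaf:name "Example Portal Operator"@en .
```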

The second and more general approach is based on [[PROV-O]], where agent roles are specified via a “qualified attribution” (prov:qualifiedAttribution). More precisely, the relevant Agent is specified via property prov:agent, whereas the role is specified with property dcat:hadRole, which takes as value a skos:Concept describing that role, such as those included in the relevant code list operated by the INSPIRE Registry [[INSPIRE-RPR]]. This pattern is illustrated in .

The [[PROV-O]]-based approach is meant to give data providers an extension mechanism to specify, in a harmonised way, any kind of role, and, possibly, to attach additional information to the role and the associated Agent. As such, it MAY be used in combination with or to complement the property-based approach, but it MUST NOT replace it. For instance, it can be used to specify roles not covered by the agent role properties used in GeoDCAT-AP, and/or to specify additional information that cannot be expressed via a property (such as the period during which an Agent held a given role).
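The qualified attribution pattern could be sketched as below, used alongside the corresponding role property rather than instead of it. The `ex:` resources are hypothetical, and the role URI is assumed to follow the URI pattern of the INSPIRE Registry code list for responsible party roles.

```turtle
@prefix dcat:    <http://www.w3.org/ns/dcat#> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix geodcat: <http://data.europa.eu/930/> .
@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix ex:      <http://example.org/> .       # hypothetical namespace

ex:roads a dcat:Dataset ;
  # Property-based approach (kept, since the PROV-O pattern must not replace it).
  geodcat:custodian ex:mapping-agency ;
  # PROV-O-based approach: the same role expressed as a qualified attribution.
  prov:qualifiedAttribution [
    a prov:Attribution ;
    prov:agent ex:mapping-agency ;
    # Role URI assumed from the INSPIRE Registry code list.
    dcat:hadRole <http://inspire.ec.europa.eu/metadata-codelist/ResponsiblePartyRole/custodian>
  ] .

ex:mapping-agency a foaf:Agent ;
  foaf:name "Example Mapping Agency"@en .
```

Because the attribution is a node in its own right, further statements (for example, a time period) can be attached to it without changing the property-based description.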

More details on these solutions are available in .

Conformance Test Results

Following [[INSPIRE-MD-REG]] and [[ISO-19115]], GeoDCAT-AP supports the specification of test results to assess the degree of conformity of a resource (i.e., a Catalogue, a Dataset, a Data Service) with a given specification (e.g., a standard, a set of implementing rules, a set of data quality or interoperability criteria).

For this purpose, GeoDCAT-AP makes use of two different but not mutually exclusive approaches. More precisely, property dct:conformsTo is used when the test result states that a resource is conformant with a given specification. The more general approach is based on [[PROV-O]], and it specifies conformance test results as a testing activity (prov:Activity) that generates a result (specified with property prov:generated), corresponding to one of the degrees of conformity defined in [[INSPIRE-MD-REG]], and available from the INSPIRE Registry [[INSPIRE-DoC]]. The specification against which the testing activity is carried out is specified via a “qualified association” (prov:qualifiedAssociation), linked to a “plan” (prov:hadPlan) derived from (prov:wasDerivedFrom) the specification itself.

This pattern is illustrated in .
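Both approaches could be sketched as follows. Several details are assumptions not stated in the text above and are flagged in the comments: the `ex:` resources, the use of prov:wasUsedBy to link the tested resource to the testing activity, and the use of dct:type to relate the result to a degree of conformity. The agent normally attached to a prov:Association is omitted for brevity.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/> .          # hypothetical namespace

# Hypothetical specification against which conformance is tested.
ex:specification a dct:Standard ;
  dct:title "Example specification"@en .

# Approach 1: the resource is stated to be conformant.
ex:roads a dcat:Dataset ;
  dct:conformsTo ex:specification .

# Approach 2: conformance test result modelled as a testing activity.
# Assumption: prov:wasUsedBy links the tested resource to the activity.
ex:roads prov:wasUsedBy ex:conformance-test .

ex:conformance-test a prov:Activity ;
  prov:generated ex:test-result ;
  prov:qualifiedAssociation [
    a prov:Association ;
    prov:hadPlan ex:test-plan
  ] .

# The plan is derived from the specification itself.
ex:test-plan a prov:Plan ;
  prov:wasDerivedFrom ex:specification .

# Assumption: dct:type carries the degree of conformity,
# using a URI from the INSPIRE Registry code list.
ex:test-result a prov:Entity ;
  dct:type <http://inspire.ec.europa.eu/metadata-codelist/DegreeOfConformity/conformant> .
```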

More details on these solutions are available in .

Accessibility and Multilingual Aspects

Accessibility in the context of this Application Profile is limited to information about the technical format of distributions of datasets. The properties dcat:mediaType and dct:format provide information that can be used to determine what software can be deployed to process the data. The accessibility of the data within the datasets needs to be taken care of by the software that processes the data and is outside of the scope of this Application Profile.
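As an illustration, a Distribution might state its technical format as sketched below. The `ex:` resource and the access URL are hypothetical; the media type is taken from the IANA registry and the format from the EU Vocabularies file type authority table, both of which are assumptions about the vocabularies chosen rather than requirements quoted from this section.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .          # hypothetical namespace

ex:roads-gml a dcat:Distribution ;
  dcat:accessURL <http://example.org/data/roads.gml> ;
  # Media type as registered with IANA.
  dcat:mediaType <https://www.iana.org/assignments/media-types/application/gml+xml> ;
  # File format from the EU Vocabularies file type authority table (assumed).
  dct:format <http://publications.europa.eu/resource/authority/file-type/GML> .
```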

Multilingual aspects related to this Application Profile concern all properties whose contents are expressed as strings (i.e. rdfs:Literal) with human-readable text. Wherever such properties are used, the string values are of one of two types:

Wherever values of properties are expressed with either type of string, the property can be repeated with translations in the case of free text and with parallel versions in case of named entities. For free text, e.g. in the cases of titles, descriptions and keywords, the language tag is mandatory.

Language tags to be used with rdfs:Literal are defined by [[BCP47]], which allows the use of the "t" extension for text transformations defined in [[RFC6497]] with the field "t0" [[CLDR]] indicating a machine translation.

A language tag will look like: "en-t-es-t0-abcd", which conveys the information that the string is in English, translated from Spanish by machine translation using a tool named "abcd".

For named entities, the language tag is optional and should only be provided if the parallel version of the name is strictly associated with a particular language. For example, the name ‘European Union’ has parallel versions in all official languages of the Union, while a name like ‘W3C’ is not associated with a particular language and has no parallel versions.
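The rules above for free text and named entities could be applied as in the following sketch. The `ex:` resources and the literal values are hypothetical; the machine-translation tag reuses the example tag given above, with "abcd" standing for a hypothetical translation tool.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex:   <http://example.org/> .          # hypothetical namespace

ex:roads a dcat:Dataset ;
  # Free text: the property is repeated per language, each value tagged.
  dct:title "Road network"@en ,
            "Red de carreteras"@es ;
  # Spanish original plus an English machine translation of it.
  dct:description "Red de carreteras nacionales."@es ,
                  "National road network."@en-t-es-t0-abcd ;
  dct:publisher ex:w3c .

# Named entity not tied to a particular language: no language tag.
ex:w3c a foaf:Agent ;
  foaf:name "W3C" .
```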

For linking to different language versions of associated web pages (e.g. landing pages) or documentation, a content negotiation [[CONNEG]] mechanism may be used whereby different content is served based on the Accept-Language header sent by the browser. Using such a mechanism, the link to the page or document can resolve to different language versions of the page or document.

All the occurrences of the property dct:language, which can be repeated if the metadata is provided in multiple languages, must have a URI as their object, not a literal string from the [[ISO-639]] code list.
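A sketch of this rule is shown below. The `ex:` resource is hypothetical, and the language URIs are taken from the EU Vocabularies language authority table, which is an assumption about the controlled vocabulary used and is not named in this section.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <http://example.org/> .          # hypothetical namespace

ex:roads a dcat:Dataset ;
  # One URI per language; not a literal such as "en" or "eng".
  dct:language <http://publications.europa.eu/resource/authority/language/ENG> ,
               <http://publications.europa.eu/resource/authority/language/SPA> .
```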

How multilingual information is handled in systems, for example in indexing and user interfaces, is outside of the scope of this Application Profile.

Privacy and security considerations

The GeoDCAT-AP vocabulary supports the attribution of data and metadata to various participants such as resource creators, publishers and other parties or agents, and as such defines terms that may be related to personal information. In addition, it also supports the association of rights and licences with catalogued Resources and Distributions. These rights and licences could potentially include or reference sensitive information such as user and asset identifiers as described in [[VOCAB-ODRL]].

Implementations that produce, maintain, publish or consume such vocabulary terms must take steps to ensure security and privacy considerations are addressed at the application level.

Acknowledgements

This work was elaborated by the GeoDCAT-AP Working Group under the ISA² programme.

The ISA² Programme of the European Commission was represented by Pavlina Fragkou and Seth van Hooland. Andrea Perego and Bert van Nuffelen were the editors of the specification.

Contributors: Andreas Kuckartz, Antonio Rotundo, Danny Vandenbroucke, Fabian Kirstein, Franck Cotton, Hannes Reuter, Jakub Klímek, James Passmore, Matthias Palmér, Riccardo Albertoni, Simon Cox.

Quick Reference of Classes and Properties

Classes and properties added by GeoDCAT-AP

Deprecated classes and properties

INSPIRE and ISO 19115 Mappings

Change Log

A full change log is available on GitHub.

Changes since first public working draft of 12 October 2020

Changes since GeoDCAT-AP 1.0.1 (2 August 2016)

Changes since GeoDCAT-AP 1.0.0 (23 December 2015)

Summary of changes to Application Profile classes and properties

The table below summarises the changes applied to the current release of GeoDCAT-AP, including those inherited from [[DCAT-AP]].