Introduction

Context

In light of the growing importance of data, the European Commission adopted an implementing act focused on High-Value Datasets on 21 December 2022 [[HVD]]. The implementing regulation groups the datasets into six thematic categories: geospatial, earth observation and environment, meteorological, statistics, companies and company ownership, and mobility. The Portal for European Data [[DEU]] has published an easy-to-read overview.

Scope

This document provides the guidelines on how to use DCAT-AP for a dataset that is subject to the requirements imposed by the High-Value dataset implementing regulation (HVD IR). The document is called the "usage guidelines of DCAT-AP for High-Value Datasets", in short DCAT-AP HVD.

To understand these guidelines, it is important to realise that the HVD IR applies to a subset of all the datasets that are collected by (Open) Data Portals in Europe. A single catalogue may contain catalogued resources both within and outside the scope of the HVD IR.

This document supports the need for a common usage of DCAT-AP for catalogued resources within the scope of the High-Value Dataset implementing regulation. When conforming to these guidelines, a dataset within the scope of the regulation will have satisfied the minimum metadata requirements to be included in the mandatory reporting of the regulation. This does not, however, imply automatic compliance with the regulation, because certain aspects are beyond DCAT-AP. For instance, the regulation requires certain data aspects to be present, and it requires the licence to be equal to or more permissive than CC-BY 4.0. Such requirements cannot be verified by just inspecting the DCAT-AP description, but the DCAT-AP description will assist in the assessment.

DCAT-AP HVD supports the implementers of the regulation in assessing their state of play. Applying the guidelines to the metadata of the datasets that are in scope of the regulation draws attention to what is still needed to reach conformance. At the same time, this effort will create an immediate benefit for European citizens and businesses. Any improvement of the metadata will immediately flow throughout the European network of (Open) Data Portals and thus increase the level of metadata quality.

Status

This application profile has the status Candidate Recommendation, published on 2023-06-19.

Information about the process and the decisions involved in the creation of this specification is available in the Changelog.

License

Copyright © 2023 European Union. All material in this repository is published under the license CC-BY 4.0, unless explicitly otherwise mentioned.

Conformance Statement

In order to conform to DCAT-AP HVD, an application MUST conform to DCAT-AP. In addition, the application must conform to the constraints and usage guidelines in this document, following conformance statements similar to those specified in DCAT-AP.

Provider requirements

In order to conform to this Application Profile, an application that provides metadata MUST apply the controlled vocabularies as described in section [[[#controlled-vocs]]].

Receiver requirements

In order to conform to this Application Profile, an application that receives metadata MUST be able to:
  • Process information for all classes and properties specified in section [[[#quick-reference]]].
  • Process information for all controlled vocabularies specified in section [[[#controlled-vocs]]].

"Processing" means that receivers must accept incoming data and transparently provide these data to applications and services. It neither implies nor prescribes what applications and services finally do with the data (parse, convert, store, make searchable, display to users, etc.).

    Terminology

    An Application Profile is a specification that reuses terms from one or more base standards, adding more specificity by identifying mandatory, recommended and optional elements to be used for a particular application, as well as recommendations for controlled vocabularies to be used.

    An Annex to an Application Profile is a specification that refines the use of some aspects of the Application Profile for a specific context.

    This specification uses the following prefixes to shorten the URIs for readability.
    | Prefix | Namespace IRI |
    | adms | http://www.w3.org/ns/adms# |
    | dcat | http://www.w3.org/ns/dcat# |
    | dcatap | http://data.europa.eu/r5r/ |
    | dct | http://purl.org/dc/terms/ |
    | dctype | http://purl.org/dc/dcmitype/ |
    | foaf | http://xmlns.com/foaf/0.1/ |
    | locn | http://www.w3.org/ns/locn# |
    | owl | http://www.w3.org/2002/07/owl# |
    | rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
    | rdfs | http://www.w3.org/2000/01/rdf-schema# |
    | skos | http://www.w3.org/2004/02/skos/core# |
    | vcard | http://www.w3.org/2006/vcard/ns# |
    | xsd | http://www.w3.org/2001/XMLSchema# |
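The prefix table can be applied mechanically. The following is a minimal sketch, assuming compact names of the form `prefix:localName`; the helper name `expand` is illustrative and not part of this specification.

```python
# Prefixes as listed in this specification.
PREFIXES = {
    "adms": "http://www.w3.org/ns/adms#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcatap": "http://data.europa.eu/r5r/",
    "dct": "http://purl.org/dc/terms/",
    "dctype": "http://purl.org/dc/dcmitype/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "locn": "http://www.w3.org/ns/locn#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
}


def expand(compact_name: str) -> str:
    """Expand a compact name such as 'dcat:Dataset' into a full IRI."""
    prefix, separator, local_name = compact_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {compact_name!r}")
    return PREFIXES[prefix] + local_name


print(expand("dcatap:hvdCategory"))
```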

    Overview

    DCAT-AP HVD is an annex to DCAT-AP. It describes additional usage of the DCAT-AP to satisfy the High-Value Dataset implementing regulation (HVD IR). In this document only the additional information, that is required for the catalogued resources which are within scope of the regulation, is included. In any other case, the guidelines of DCAT-AP itself are applicable.

    Application profile diagram

    An overview of DCAT-AP HVD is shown by the UML diagram below. The UML diagram illustrates the specification described in this document. For readability purposes the representation has been condensed as follows:

    The cardinalities and qualifications are included in the figure.

    For readability of this document as an annex to DCAT-AP, the core relationships between classes are included.

    This document describes the usage of the following main entities for a correct usage of the Application Profile:
    | Agent | Catalogue | Catalogue Record | Catalogued Resource | Data Service | Dataset | Distribution | Kind | Licence Document |

    The main entities are supported by:
    | Concept | Document | Identifier | Legal Resource | Literal | Resource | Rights statement | Standard |

    Main Entities

    The main entities are those that form the core of the Application Profile.

    Agent

    Definition
    Any entity carrying out actions with respect to the (Core) entities Catalogue, Datasets, Data Services and Distributions.
    Reference in DCAT
    Link
    Usage Note
    If the Agent is an organisation, the use of the Organization Ontology is recommended.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Catalogue

    Definition
    A catalogue or repository that hosts the Datasets or Data Services being described.
    Reference in DCAT
    Link
    Properties
    For this entity the following properties are defined: applicable legislation, dataset, licence, publisher, record, service.
    | Property | Range | Card | Definition | Usage | DCAT |
    | applicable legislation | Legal Resource | 1 | The legislation that mandates the creation or management of the Catalogue. | For HVD the value must include the ELI http://data.europa.eu/eli/reg_impl/2023/138/oj. As multiple legislations may apply to the resource the maximum cardinality is not limited. | |
    | dataset | Dataset | 0..* | A Dataset that is part of the Catalogue. | As empty Catalogues are usually indications of problems, this property should be combined with the next property service to implement an empty Catalogue check. | Link |
    | licence | Licence Document | 0..1 | A licence under which the Catalogue can be used or reused. | | Link |
    | publisher | Agent | 1 | An entity (organisation) responsible for making the Catalogue available. | | Link |
    | record | Catalogue Record | 0..* | A Catalogue Record that is part of the Catalogue. | | Link |
    | service | Data Service | 0..* | A site or end-point (Data Service) that is listed in the Catalogue. | As empty Catalogues are usually indications of problems, this property should be combined with the previous property dataset to implement an empty Catalogue check. | Link |
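The empty Catalogue check mentioned in the usage notes for dataset and service can be sketched as follows. This is an illustration only: it assumes a catalogue held as a plain dictionary keyed by the property labels used in this document, which is not a representation prescribed by DCAT-AP.

```python
def is_empty_catalogue(catalogue: dict) -> bool:
    """Return True when the catalogue lists neither Datasets nor Data Services."""
    datasets = catalogue.get("dataset", [])
    services = catalogue.get("service", [])
    # Both properties are checked together: a catalogue that only lists
    # Data Services (or only Datasets) is not considered empty.
    return not datasets and not services


# Hypothetical catalogue that lists a single Data Service and no Datasets.
catalogue = {"service": ["https://dataportal.exampleMS.gov/service/bees-api"]}
print(is_empty_catalogue(catalogue))
```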

    Catalogue Record

    Definition
    A description of a Dataset’s entry in the Catalogue.
    Reference in DCAT
    Link
    Properties
    For this entity the following properties are defined: primary topic.
    | Property | Range | Card | Definition | Usage | DCAT |
    | primary topic | Catalogued Resource | 1 | A link to the Dataset, Data Service or Catalogue described in the record. | A catalogue record will refer to one entity in a catalogue. This can be either a Dataset or a Data Service. To ensure an unambiguous reading of the cardinality, the range is set to Catalogued Resource. However, this range is not intended to require the explicit use of the class Catalogued Resource. As it is an abstract class, a subclass should be used. | Link |

    Catalogued Resource

    Definition
    Resource published or curated by a single agent.
    Reference in DCAT
    Link
    Usage Note
    For DCAT-AP, the class is considered an abstract notion.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Data Service

    Definition
    A collection of operations that provides access to one or more datasets or data processing functions.
    Reference in DCAT
    Link
    Subclass of
    Catalogued Resource
    Properties
    For this entity the following properties are defined: applicable legislation, contact point, endpoint description, endpoint URL, HVD category, licence, quality of Service, rights, serves dataset, title.
    | Property | Range | Card | Definition | Usage | DCAT |
    | applicable legislation | Legal Resource | 1..* | The legislation that mandates the creation or management of the Data Service. | For HVD the value must include the ELI http://data.europa.eu/eli/reg_impl/2023/138/oj. As multiple legislations may apply to the resource the maximum cardinality is not limited. | |
    | contact point | Kind | 1..* | Contact information that can be used for sending comments about the Data Service. | | Link |
    | endpoint description | Resource | 0..* | A description of the services available via the end-points, including their operations, parameters etc. | The property gives specific details of the actual endpoint instances, while dct:conformsTo is used to indicate the general standard or specification that the endpoints implement. | Link |
    | endpoint URL | Resource | 1..* | The root location or primary endpoint of the service (an IRI). | The endpoint URL should be persistent. This means that publishers should do everything in their power to keep the value stable and resolvable. | Link |
    | HVD category | Concept | 1..* | The HVD category to which this Data Service belongs. | | |
    | licence | Licence Document | 1 | A licence under which the Data Service is made available. | | Link |
    | quality of Service | Document | 1..* | A web page that provides details about the quality of service the Data Service provides and/or additional information. | Quality of service covers a broad spectrum of aspects. The HVD regulation does not list any mandatory topic. | |
    | rights | Rights statement | 0..* | A statement that specifies rights associated with the Data Service. | | |
    | serves dataset | Dataset | 1..* | This property refers to a collection of data that this data service can distribute. | | Link |
    | title | Literal | 1..* | A name given to the Data Service. | This property can be repeated for parallel language versions of the name. | Link |
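The minimum cardinalities in the Data Service table lend themselves to a simple completeness check. The sketch below is an assumption-laden illustration: the dictionary representation, the property labels used as keys and the helper name `missing_properties` are hypothetical, and maximum cardinalities (such as the single licence) are deliberately not checked.

```python
# Minimum cardinality per property, copied from the Data Service table.
DATA_SERVICE_MIN_CARD = {
    "applicable legislation": 1,
    "contact point": 1,
    "endpoint description": 0,
    "endpoint URL": 1,
    "HVD category": 1,
    "licence": 1,
    "quality of Service": 1,
    "rights": 0,
    "serves dataset": 1,
    "title": 1,
}


def missing_properties(service: dict) -> list:
    """List the properties that have fewer values than the table requires."""
    missing = []
    for prop, minimum in DATA_SERVICE_MIN_CARD.items():
        if len(service.get(prop, [])) < minimum:
            missing.append(prop)
    return missing


# Hypothetical, incomplete Data Service description.
service = {"title": ["Bee population API"], "licence": ["https://example.org/licence"]}
print(missing_properties(service))
```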

    Dataset

    Definition
    A conceptual entity that represents the information published.
    Reference in DCAT
    Link
    Subclass of
    Catalogued Resource
    Properties
    For this entity the following properties are defined: applicable legislation, conforms to, contact point, dataset distribution, HVD category, other identifier, publisher.
    | Property | Range | Card | Definition | Usage | DCAT |
    | applicable legislation | Legal Resource | 1..* | The legislation that mandates the creation or management of the Dataset. | For HVD the value must include the ELI http://data.europa.eu/eli/reg_impl/2023/138/oj. As multiple legislations may apply to the resource the maximum cardinality is not limited. | |
    | conforms to | Standard | 0..* | An implementing rule or other specification. | The provided information should enable verification of whether the detailed information requirements of the HVD are satisfied. | Link |
    | contact point | Kind | 0..* | Contact information that can be used for sending comments about the Dataset. | | Link |
    | dataset distribution | Distribution | 1..* | An available Distribution for the Dataset. | | Link |
    | HVD category | Concept | 1..* | The HVD category to which this Dataset belongs. | | |
    | other identifier | Identifier | 0..* | A secondary identifier of the Dataset, such as MAST/ADS, DataCite, DOI, EZID or W3ID. | It is recommended to include all identifiers that the catalogue knows about this dataset. This allows others to find datasets more easily. | Link |
    | publisher | Agent | 0..1 | An entity (organisation) responsible for making the Dataset available. | | Link |

    Distribution

    Definition
    A physical embodiment of the Dataset in a particular format.
    Reference in DCAT
    Link
    Usage Note
    Bulk downloads should be encoded as a Distribution.
    Properties
    For this entity the following properties are defined: access service, access URL, applicable legislation, HVD category, licence, linked schemas.
    | Property | Range | Card | Definition | Usage | DCAT |
    | access service | Data Service | 0..* | A data service that gives access to the distribution of the dataset. | | Link |
    | access URL | Resource | 1..* | A URL that gives access to a Distribution of the Dataset. | The resource at the access URL may contain information about how to get the Dataset. | Link |
    | applicable legislation | Legal Resource | 1..* | The legislation that mandates the creation or management of the Distribution. | For HVD the value must include the ELI http://data.europa.eu/eli/reg_impl/2023/138/oj. As multiple legislations may apply to the resource the maximum cardinality is not limited. | |
    | HVD category | Concept | 0..* | The HVD category to which this Distribution belongs. | | |
    | licence | Licence Document | 1 | A licence under which the Distribution is made available. | | Link |
    | linked schemas | Standard | 0..* | An established schema to which the described Distribution conforms. | | Link |

    Kind

    Definition
    A description following the vCard specification, e.g. to provide telephone number and e-mail address for a contact point.
    Usage Note
    Note that the class Kind is the parent class for the four explicit types of vCards (Individual, Organization, Location, Group).
    Properties
    For this entity the following properties are defined: contact page, email.
    | Property | Range | Card | Definition | Usage | DCAT |
    | contact page | Resource | 1 | A web page that either allows contact to be made (e.g. via a web form) or contains information on how to get in contact. | | |
    | email | Resource | 1 | An email address via which contact can be made. | | |

    Licence Document

    Definition
    A legal document giving official permission to do something with a resource.
    Usage Note
    The HVD regulation requires a machine readable representation of a Licence. The minimal data model to describe a licence Document is beyond this specification. Nevertheless in [[[#c3]]] some suggestions are made.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Supportive Entities

    The supportive entities are supporting the main entities in the Application Profile. They are included in the Application Profile because they form the range of properties.

    Concept

    Definition
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Document

    Definition
    A textual resource intended for human consumption that contains information, e.g. a web page about a Dataset.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Identifier

    Definition
    An identifier in a particular context, consisting of the string that is the identifier; an optional identifier for the identifier scheme; an optional identifier for the version of the identifier scheme; an optional identifier for the agency that manages t
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Legal Resource

    Definition
    This class represents the legislation, policy or policies that lie behind the Rules that govern the service.
    Usage Note
    The definition and properties of the Legal Resource class are aligned with the ontology included in "Council conclusions inviting the introduction of the European Legislation Identifier (ELI)". For describing the attributes of a Legal Resource (labels, preferred labels, alternative labels, definition, etc.) we refer to the ELI ontology. In this data specification the use is restricted to instances of this class that follow the ELI URI guidelines.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Literal

    Definition
    A literal value such as a string or integer; Literals may be typed, e.g. as a date according to xsd:date. Literals that contain human-readable text have an optional language tag as defined by BCP 47.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Resource

    Definition
    Anything described by RDF.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Rights statement

    Definition
    A statement about the intellectual property rights (IPR) held in or over a resource, a legal document giving official permission to do something with a resource, or a statement about access rights.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Standard

    Definition
    A standard or other specification to which a Dataset or Distribution conforms.
    Properties
    This specification does not impose any additional requirements to properties for this entity.

    Controlled Vocabularies

    The usage of controlled vocabularies in DCAT-AP HVD conforms to and extends the usage defined by DCAT-AP. In addition, the following controlled vocabularies are defined to be used:

    Controlled vocabularies to be used

    In the table below, a number of properties are listed with controlled vocabularies that MUST be used for the listed properties. The declaration of the following controlled vocabularies as mandatory ensures a minimum level of interoperability.
    | Property URI | Used for Class | Vocabulary name | Vocabulary URI | Usage note |
    | dcatap:hvdCategory | Dataset | EU Vocabularies HVD Categories | TBD | |

    Licence controlled vocabularies

    The HVD regulation imposes quality requirements on the published legal conditions. In line with the generic DCAT-AP guidelines for publishing controlled vocabularies, a licence controlled vocabulary SHOULD:

    Mapping the HVD regulation to DCAT-AP

    This section provides recommendations on how to encode descriptions required by the HVD implementing regulation (HVD IR) as a DCAT-AP metadata structure. Each topic is introduced first from the perspective of the HVD IR, followed by an assessment of the topic on the use of DCAT-AP. The selected interpretation is further elaborated, where appropriate, with implementation guidelines.

    Alignment of terminology

    The HVD implementation regulation uses the terms Dataset, Bulk Download and API.

    In the context of DCAT-AP, a HVD Dataset is mapped to a Dataset, a Bulk Download to a Distribution and an API to a Data Service. To be conformant with the use of DCAT-AP in the context of the HVD regulation, this mapping MUST be followed.
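The terminology mapping above can be captured as a small lookup table. This is a sketch for illustration; the function name `dcat_class_for` is hypothetical, while the class names are the DCAT classes behind the entities Dataset, Distribution and Data Service.

```python
# HVD IR term -> DCAT class used in DCAT-AP.
HVD_TERM_TO_DCAT_CLASS = {
    "Dataset": "dcat:Dataset",
    "Bulk Download": "dcat:Distribution",
    "API": "dcat:DataService",
}


def dcat_class_for(hvd_term: str) -> str:
    """Return the DCAT class that represents the given HVD IR term."""
    try:
        return HVD_TERM_TO_DCAT_CLASS[hvd_term]
    except KeyError:
        raise ValueError(f"{hvd_term!r} is not a term mapped by these guidelines") from None


print(dcat_class_for("Bulk Download"))
```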

    To make the text easier to read, with a HVD Dataset we mean a Dataset in scope of the HVD implementing regulation. The same pattern is applied to other entities.

    In scope of HVD regulation

    A Dataset is a HVD dataset if and only if a Member State (MS) has included it in its reporting. The HVD IR defines high-value datasets. It is possible that the same definition applies to multiple entities. In that case, a Member State should select the most appropriate one, according to the rules in the regulation. If the Member State decides to include multiple entities in the reporting, the requirements set out in the HVD regulation will apply to all these entities. Also, if a Member State decides to include a dataset in the HVD reporting for which inclusion is not mandatory, then the requirements of the HVD regulation will apply. The report is a commitment by the Member State to the European data community to sustain those datasets.

    If a re-user discovers a dataset that seems to be in scope of the HVD regulation, then the responsible MS should be able to provide an explanation why it is not included in the reporting. One response to this question could be by providing the relevant HVD dataset corresponding to that dataset.

    Denoting a HVD dataset

    Each entity (Dataset, Data Service, Distribution, Catalogue) that is identified by a MS in scope of the HVD IR should provide the European Legislation Identifier (ELI) http://data.europa.eu/eli/reg_impl/2023/138/oj of the HVD regulation for the property applicable legislation. For the reporting, a Member State can provide a catalogue containing all elements that are within scope for the reporting of the HVD IR. In that case the catalogue should also set the value for the property applicable legislation to the ELI of the HVD.
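Denoting an entity as in scope of the HVD IR amounts to adding the ELI as a value of applicable legislation while leaving other values untouched. A minimal sketch, assuming entities are held as dictionaries and assuming the property IRI `http://data.europa.eu/r5r/applicableLegislation` (the IRI is not stated in this section and is an assumption here):

```python
HVD_ELI = "http://data.europa.eu/eli/reg_impl/2023/138/oj"
# Assumed IRI for the property "applicable legislation".
APPLICABLE_LEGISLATION = "http://data.europa.eu/r5r/applicableLegislation"


def denote_as_hvd(entity: dict) -> dict:
    """Return a copy of the entity whose applicable legislation includes the HVD ELI."""
    legislation = list(entity.get(APPLICABLE_LEGISLATION, []))
    if HVD_ELI not in legislation:
        # Other legislation (e.g. INSPIRE) stays in place: the cardinality is unbounded.
        legislation.append(HVD_ELI)
    return {**entity, APPLICABLE_LEGISLATION: legislation}


# Hypothetical dataset already subject to another (fictitious) legislation.
dataset = {"@id": "https://example.org/dataset/bees",
           APPLICABLE_LEGISLATION: ["https://example.org/eli/other"]}
print(denote_as_hvd(dataset)[APPLICABLE_LEGISLATION])
```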

    Special cases

    When a Dataset is within scope of HVD, it is not mandatory that all its Distributions are within scope of HVD. The recommendations in this document ensure that existing metadata (specified in DCAT-AP or other frameworks like INSPIRE) remains valid. Becoming a Dataset in scope of HVD is an additional operation.

    When a Data Service offers access to multiple datasets and this Data Service fulfils the HVD requirements for a HVD Dataset (e.g. it is the HVD API for that dataset), then the HVD requirements apply only to that HVD Dataset. It is common that the same API service endpoint (denoted by a dcat:DataService) provides access to multiple datasets. As such, it is to be expected that only some of the datasets are within scope of HVD. As for Distributions, the HVD does not enforce that all Datasets associated with a Data Service must be in scope of HVD. Nevertheless, it must be noted that the HVD requirements on a Data Service might indirectly impact the other datasets that are available through the same Data Service, because a Data Service will share the operational and service level requirements across all its associated datasets.

    HVD data category

    The HVD IR defines six thematic data categories: geospatial, earth observation and environment, meteorological, statistics, companies and company ownership, and mobility. A new property HVD category is introduced to indicate the HVD category to which an entity belongs. The code list will be created and maintained by the Publications Office. An entity may belong to more than one data category.

    Identifiers

    In general, the requirements of the HVD IR are satisfied when the best practices of DCAT-AP on identifiers are followed. According to the HVD regulation, the identifiers provided in the report should be an online reference to the metadata.

    In short these are:

    In practice, multiple identifiers may have been assigned to a Dataset. It is recommended to select a master identifier and use this one to implement the HVD regulation. In general, harvesters and portals are advised to use and promote this identifier as the identifier for the HVD Dataset. In addition it is recommended to augment the list of alternative identifiers (adms:identifier) with the encountered identifiers. These identifier processing recommendations are made to ensure that the information in usages like the HVD reporting (i.e. a reference to a dataset) is interconnected with the real data.
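The identifier processing recommended above can be sketched as follows. The sketch simplifies the model: it keeps alternative identifiers as plain strings under an `adms:identifier` key, whereas in DCAT-AP the range of that property is the Identifier class. The function name and the example identifiers are hypothetical.

```python
def merge_identifiers(master: str, encountered: list) -> dict:
    """Keep the master identifier as the subject and list all others as alternatives."""
    alternatives = []
    for identifier in encountered:
        # Skip the master itself and drop duplicates while preserving order.
        if identifier != master and identifier not in alternatives:
            alternatives.append(identifier)
    return {"@id": master, "adms:identifier": alternatives}


record = merge_identifiers(
    "https://environment.exampleMS.gov/id/dataset/bees",      # hypothetical master identifier
    [
        "https://dataportal.exampleMS.gov/dataset/1234",       # hypothetical portal identifier
        "https://environment.exampleMS.gov/id/dataset/bees",
        "https://dataportal.exampleMS.gov/dataset/1234",
    ],
)
print(record)
```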

    Persistent identifiers

    The HVD regulation requires as part of the reporting requirements (article 5.3), that Licensing Conditions and APIs have persistent links.

    Persistence means that, for these entities, Member States take the responsibility to maintain the real world entity indefinitely and additionally reduce the accessibility challenge by maintaining the same name for that real world entity indefinitely. Thus for the entities that MSs include in the reporting and for which the reporting requires a persistent link, a MS makes a persistent commitment.

    As DCAT-AP is a Semantic Web data specification, persistence is associated with the use of persistent URIs (PURIs) for the metadata descriptions. A general advice for DCAT-AP implementers is to use PURIs for all entities, but in particular for Datasets and Data Services. Practice shows, though, that this is not commonly applied, and therefore guidelines on identifiers have been proposed [[IdentifierGuidelines]]. Implementers of the HVD regulation are advised to read these guidelines to understand how identity might or might not be preserved from one data portal to another, and take the appropriate actions.

    In article 5.3 of the HVD IR, the broad term Licensing Conditions is used, while in the other parts of the regulation the term Licence is used. DCAT-AP provides several means to express legal information, notably the properties licence (dct:license) and rights (dct:rights). This may lead to questions whether rights are included by the reporting requirement. As the final objective is to provide a trusted legal statement, it is considered that the requirement for a persistent link applies to rights too.

    The reporting requirement for a persistent link for APIs is ambiguous from the perspective of DCAT-AP. In DCAT-AP, there is the identifier for the Data Service, i.e. the description of the API, and the property endpoint URL, which is the technical endpoint via which the data exchange will happen. As the HVD regulation does not specify precisely which case it covers, both are considered in scope. That means that each Data Service has a PURI and that the endpoint URL is persistent. Thus when an API is moved to a new platform (e.g. from a local API gateway to an organisation-wide one), the original endpoint URL must be maintained, and the metadata management must be maintained as well.

    DCAT-AP does not impose persistent identification of an endpoint URL. It, however, expects a life cycle management of the API through metadata. For that, DCAT-AP includes a few properties for life cycle management.

    For example, consider an API which is at the end of its lifecycle. According to DCAT-AP, the PURI of the Data Service could get the status ‘deprecated’ and the endpoint URL could be made void when it is taken offline. Any data portal user would understand that this Data Service should no longer be used. If the metadata is augmented with the information about the successor of the Data Service, the data portal owner can be guided to the new Data Service.

    The impact for a user of the endpoint URL is higher: systems might break when the endpoint URL is taken offline. This situation is the result of a shared responsibility: either the publishers did not apply decent life cycle management, or the users did not inform the publisher about their critical dependency. The enforcement of a persistent link for the endpoint URL will reduce the occurrence of such cases, but it will not make them disappear. Because of this, even if the API gives access to open data, users that depend on the API are advised to inform the publisher. Publishers, in turn, must improve their life cycle management for these APIs so that re-users get the right information and can take a change of the API endpoint into account in their roadmaps.

    In summary, the recommendation is to have persistence for both aspects of the API: its metadata identifier as well as its endpoint URL.

    Legal information

    The HVD regulation requires a high level of metadata quality for legal information. The information should be provided in machine and human readable format, using a persistent link. Furthermore, it should be possible to investigate whether the legal conditions are equal or more permissive than the reference CC-BY 4.0.

    Despite these strong requirements, the HVD regulation does not alter the general use of DCAT-AP for legal information. The HVD requirements extend or refine how the legal information should technically be provided. In this documentation it is assumed that legal information corresponds to licences as rights expressions. In currently allowed practice, licence information may be supplied by a collection of rights statements in cases where national legislation does not allow a licence document to be provided. This is compatible with the HVD regulation, and in that case the HVD requirements also apply to the rights statements. HVD does not require changing the current DCAT-AP principle of indicating the legal information at the most precise level in the metadata description, i.e. Data Service and Distribution; this principle is therefore maintained.

    Catalogue owners are advised to assess the legal information provided by the publishers according to the flows in the figures below. For instance, if a publisher provides licence information referring to a licence document that the publisher itself makes accessible online, then the publisher of that information must implement the HVD quality requirements for licence documents. The decision trees in the figures make it possible to assess whether additional effort is required.

    The decision tree for licence information.
    The decision tree for licence information.

    To support the assessment whether the assigned legal conditions are equal or more permissive than the reference CC-BY 4.0, the recommendation is to augment the machine-readable publication of MS-specific or publisher-specific licences with mapping information on the Licence NAL. It is recommended to use, in order of preference, the SKOS mapping properties, owl:sameAs or rdfs:seeAlso, to express this mapping.
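The order of preference for the mapping properties can be sketched as a simple lookup. Several points are assumptions made for illustration: the relative order among the individual SKOS mapping properties is not specified in this document, the licence is represented as a plain dictionary, and the licence IRIs shown (including the Licence NAL entry for CC-BY 4.0) are illustrative values.

```python
from typing import Optional, Tuple

# SKOS mapping properties first, then owl:sameAs, then rdfs:seeAlso.
# The order among the SKOS properties themselves is an assumption.
MAPPING_PREFERENCE = [
    "skos:exactMatch",
    "skos:closeMatch",
    "skos:broadMatch",
    "skos:narrowMatch",
    "skos:relatedMatch",
    "owl:sameAs",
    "rdfs:seeAlso",
]


def preferred_mapping(licence: dict) -> Optional[Tuple[str, str]]:
    """Return the most preferred (property, target) mapping present on the licence."""
    for prop in MAPPING_PREFERENCE:
        if prop in licence:
            return prop, licence[prop]
    return None


# Hypothetical publisher-specific licence mapped to an assumed Licence NAL entry.
licence = {
    "@id": "https://dataportal.exampleMS.gov/licence/open-1.0",
    "rdfs:seeAlso": "https://creativecommons.org/licenses/by/4.0/",
    "skos:exactMatch": "http://publications.europa.eu/resource/authority/licence/CC_BY_4_0",
}
print(preferred_mapping(licence))
```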

    In the reporting requirements of the HVD regulation, the notion terms of use is used. It has been agreed, by the Working Group for this documentation, that providing terms of use information is the same as providing legal information for a Data Service.

    Contact Point

    The HVD regulation requests a contact point for APIs.

    This requirement is implemented as the following recommendation. A contact point is mandatory for HVD Data Services and recommended for HVD Datasets either in the form of a (persistent) email address or a link to a contact form on a webpage, e.g. to contact a service desk.
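The recommendation can be turned into a small check that distinguishes the mandatory case (Data Services) from the recommended case (Datasets). The sketch assumes contact points are dictionaries keyed by the property labels of the Kind entity; the function name and the returned labels "ok", "warning" and "error" are hypothetical.

```python
def contact_point_finding(entity_type: str, contact_points: list) -> str:
    """Classify the contact information of a HVD Data Service or HVD Dataset."""
    usable = any(
        point.get("email") or point.get("contact page")
        for point in contact_points
    )
    if usable:
        return "ok"
    # Mandatory for Data Services, only recommended for Datasets.
    return "error" if entity_type == "Data Service" else "warning"


# Hypothetical contact point offering a web form on the national portal.
points = [{"contact page": "https://dataportal.exampleMS.gov/contact"}]
print(contact_point_finding("Data Service", points))
print(contact_point_finding("Dataset", []))
```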

    Specific data requirements

    The HVD implementation regulation describes, in its Annex, precisely the data elements that should be provided for a HVD dataset. A HVD dataset must conform to the rules defined in the HVD IR.

    It is recommended to provide a reference to a public document (for instance: data standards) that describes the internals of the Dataset (or Distribution) using the property conforms to. This ensures that the information is made publicly accessible for reusers. Additionally, it can be used by experts to verify if the Dataset matches the HVD requirements.

    Reporting

    The HVD regulation requires EU Member States (reporter) to report the list of HVD datasets. This objective can be achieved by providing to the European Commission (reporting authority) a catalogue containing all the metadata about all the Datasets, Distributions and Data Services that are in scope of the HVD. When a MS has all metadata collected in a national data catalogue, then the report can be created by querying the national catalogue for all entities that have http://data.europa.eu/eli/reg_impl/2023/138/oj as applicable legislation. If the MS has assigned persistent identifiers, as explained in [[[#c5]]], to the metadata entities, then it is even possible for the reporting authority to collect the metadata by querying the MS national catalogue or even the Portal for European Data [[DEU]]. This potential shows that this documentation and the consensus reached in the Working Group for this specification help to reduce the aggregation effort at MS level while at the same time reinforcing the existing metadata practices. However, the approach used for the reporting (format and process) is beyond the scope of this document.
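The catalogue query described above can be sketched over an in-memory list of (subject, predicate, object) triples. A real implementation would more likely query a triple store; the property IRI for applicable legislation is an assumption, and the entity identifiers are fictitious.

```python
HVD_ELI = "http://data.europa.eu/eli/reg_impl/2023/138/oj"
# Assumed IRI for the property "applicable legislation".
APPLICABLE_LEGISLATION = "http://data.europa.eu/r5r/applicableLegislation"


def hvd_report(triples: list) -> list:
    """Return the sorted identifiers of all entities that declare the HVD ELI."""
    in_scope = {
        subject
        for subject, predicate, obj in triples
        if predicate == APPLICABLE_LEGISLATION and obj == HVD_ELI
    }
    return sorted(in_scope)


# Fictitious national catalogue: only the bees dataset is in scope of the HVD IR.
catalogue_triples = [
    ("https://example.org/dataset/bees", APPLICABLE_LEGISLATION, HVD_ELI),
    ("https://example.org/dataset/bees", APPLICABLE_LEGISLATION, "https://example.org/eli/other"),
    ("https://example.org/dataset/wasps", APPLICABLE_LEGISLATION, "https://example.org/eli/other"),
]
print(hvd_report(catalogue_triples))
```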

    Example

    In this section we illustrate the recommendations for DCAT-AP for the HVD implementing regulation. The examples in this section are fictitious; their sole purpose is to illustrate the metadata.

    Datasets in scope of HVD

    Consider that a dataset "The population of bees" is within scope of the HVD while another dataset "The population of wasps" is not. Both datasets, however, are in scope of the INSPIRE directive.

    Example 1 - Bees and wasps population datasets

    Both datasets are published by the Environment Agency of the EU Member State ExampleMS using a persistent identifier.

    Example 2 - MS dataset

    The datasets are published on the Member State's national data portal https://dataportal.exampleMS.gov. This portal assigns another identifier to the datasets. Because that new identifier is not the master identifier, the portal avoids replacing the master identifier: in its published DCAT-AP catalogue it lists the new identifier as an additional identifier.
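
    The idea can be sketched in Python as follows, assuming the metadata is handled as (subject, property, value) tuples. The master URI, the portal identifier, the naming of the identifier node and the use of skos:notation to carry the identifier value are assumptions of this sketch.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

ADMS_IDENTIFIER = "http://www.w3.org/ns/adms#identifier"
SKOS_NOTATION = "http://www.w3.org/2004/02/skos/core#notation"


def with_portal_identifier(master_uri: str, portal_id: str) -> List[Triple]:
    """Keep the master URI as the subject and attach the portal identifier to it."""
    # Hypothetical naming scheme for the node that holds the additional identifier.
    identifier_node = master_uri + "#portal-identifier"
    return [
        (master_uri, ADMS_IDENTIFIER, identifier_node),
        (identifier_node, SKOS_NOTATION, portal_id),
    ]


# Hypothetical identifiers for the fictitious bee population dataset.
triples = with_portal_identifier(
    "https://data.exampleMS.gov/id/dataset/bees",
    "https://dataportal.exampleMS.gov/dataset/1234",
)
for triple in triples:
    print(triple)
```

    The point of the design is that the dataset is still described under its master identifier, so harvesters further down the chain can recognise it as the same dataset.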

    Example 3 - MS dataset with 2 identifiers

    Bulk downloads for HVD datasets

    The datasets are downloadable in various formats and levels of detail. In our example, the data is available in two formats: RDF and ESRI shapefile format. According to the HVD regulation, the datasets must minimally be available in bulk download with the granularity of 50 square kilometers and with a bi-yearly update frequency. It must also be available in an open format for geospatial data. Based on these requirements, the publisher of the dataset decides to indicate that the shapefile-based distribution is an HVD bulk download.

    Example 4 - MS dataset with 2 distributions

    The HVD regulation also specifies that the dataset should at least provide information about the number of bees, the calculation method, the amount of honey being harvested and the number of beekeepers active in the area. The publisher describes the data semantically using an application profile, and provides detailed data schema documentation for each distribution.

    Example 5 - MS dataset conform to a profile

    The HVD dataset is accessible via an API

    According to the HVD regulation, the "bee population" dataset must be made available via an API. The dataset publisher has an API platform deployed, via which data users have access to realtime data. This API platform supports all datasets of the publisher.

    Example 6 - MS dataset with data service

    Because the API platform is provided as the API for the "bee population" dataset, the HVD implementing regulation requirements apply. This means that the endpoint URL must be persistent. The publisher should make every effort to keep the endpoint URL stable. For instance, deploying a new API platform or changing organisation names should not affect the endpoint URL.

    To provide information about the use of the API platform, the publisher provides OpenAPI technical documentation and an SLA to document the quality of service.

    Example 7 - MS data service with OpenAPI and SLA

    To address any questions from users, the publisher operates a service desk.

    Example 8 - MS data service with publisher service desk

    Expressing legal conditions

    The Member State (MS) requires, via its legislation, that public bodies use national data licences. Therefore, the dataset publisher is required to use one of them. As support to the community, the MS has published the data licences as a SKOS taxonomy, using persistent URIs. For the "wasp population" dataset, a restrictive licence is chosen because the data is based on information that has commercial rights including fees. The "bee population" dataset is shared with a very permissive licence.

    Example 9 - MS dataset distributions with different licences

    In order to support the assessment of the used licences, the MS maps the licences to the NAL Licences [[NAL-Licence]].

    Example 10 - Mapping licences

    The HVD regulation requires that the licence for the "bee population" dataset be at least as permissive as CC-BY 4.0. The "bee population" licence is https://data.exampleMS.gov/resource/FreeAndOpen, which is an exact match with http://publications.europa.eu/resource/authority/licence/CC0. Since this CC0 licence is more permissive than CC-BY 4.0, the HVD requirement is met.
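
    The reasoning above can be sketched in Python as follows. This is a sketch under assumptions: the exact-match mapping is held in a dictionary, and the set of EU licences considered at least as permissive as CC-BY 4.0 is curated by hand for the illustration. The NAL code `CC_BY_4_0` is assumed here; such an assessment remains a matter for experts.

```python
NAL = "http://publications.europa.eu/resource/authority/licence/"

# Mapping from national licences to EU licences (exact matches only).
EXACT_MATCH = {
    "https://data.exampleMS.gov/resource/FreeAndOpen": NAL + "CC0",
}

# Hand-curated for this sketch: licences deemed at least as permissive as CC-BY 4.0.
AT_LEAST_AS_PERMISSIVE_AS_CC_BY_4 = {NAL + "CC0", NAL + "CC_BY_4_0"}


def meets_hvd_licence_requirement(licence: str) -> bool:
    """Resolve a national licence to its EU exact match and test its permissiveness."""
    eu_licence = EXACT_MATCH.get(licence, licence)
    return eu_licence in AT_LEAST_AS_PERMISSIVE_AS_CC_BY_4


print(meets_hvd_licence_requirement("https://data.exampleMS.gov/resource/FreeAndOpen"))
print(meets_hvd_licence_requirement("https://data.exampleMS.gov/resource/NoCommercialUseWithFees"))
```

    A licence without a mapping, such as the restrictive national licence in the second call, cannot be shown to meet the requirement and is therefore reported as not meeting it.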

    Because producing the RDF representation adds provenance information that is sensitive, the publisher changes the licence for that distribution to a more restrictive one.

    Example 11 - Restricting licences

    Although this restricted licence https://data.exampleMS.gov/resource/NoCommercialUseWithFees does not meet the HVD requirements for the "bee population" dataset, the "bee population" dataset is still conformant to the HVD implementing regulation as the RDF distribution was not within scope of the HVD. The same reasoning holds for the "wasp population" dataset. This illustrates the flexibility the DCAT-AP HVD specification offers to address complex and rare scenarios data publishers might face.

    The Data Service exampleMS:EAMS-APIplatform provides access to both datasets. The legal conditions on the usage of the platform for the "bee population" dataset are a combination of the API platform conditions (e.g. no misuse by triggering DDOS activities, no sharing of access tokens with third parties, etc.) and the dataset conditions. The API request https://orgea.exampleMS.gov/api/v2/beepopulation/ thus has different conditions than https://orgea.exampleMS.gov/api/v2/wasppopulation/. Therefore, the nature of the licence document associated with a Data Service is usually more oriented to the use of the API platform than to the use of the data it provides access to.

    In the example, the 'Terms of Use' for the API platform are mentioned as the licence. In addition, the API platform can also indicate the SLA it offers.

    Example 12 - Data service with terms and SLA

    Reporting

    The MS reports its HVD conformance status by providing a catalogue containing all metadata in scope of HVD. To facilitate the conformance assessment, it will only include the Datasets, Data Services and Distributions that are in scope of HVD. The catalogue will also contain any additional supportive information such as ContactPoints, Agents and the mapping for the licences to the EU Licences.

    Example 13 - MS Catalogue

    To reduce the risk of misinterpretation, the properties that connect Catalogued Resources, such as dcat:servesDataset and dcat:distribution, should be checked so that they do not refer to Catalogued Resources outside the scope of HVD. In the example below, the references to the RDF distributions of the "bee population" and the "wasp population" datasets are removed from the reporting catalogue.
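
    Such a check could be automated along the following lines. This Python sketch again models the catalogue as (subject, property, value) tuples and treats a resource as in scope when it names the HVD IR as applicable legislation; the resource URIs are hypothetical.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]

APPLICABLE_LEGISLATION = "http://data.europa.eu/r5r/applicableLegislation"
HVD_IR = "http://data.europa.eu/eli/reg_impl/2023/138/oj"
CONNECTING_PROPERTIES = {
    "http://www.w3.org/ns/dcat#distribution",
    "http://www.w3.org/ns/dcat#servesDataset",
}


def prune_out_of_scope_links(triples: Iterable[Triple]) -> List[Triple]:
    """Drop connecting triples whose target is not in scope of the HVD IR."""
    triples = list(triples)
    in_scope = {s for s, p, o in triples if p == APPLICABLE_LEGISLATION and o == HVD_IR}
    return [
        (s, p, o)
        for s, p, o in triples
        if not (p in CONNECTING_PROPERTIES and o not in in_scope)
    ]


# Hypothetical URIs: the shapefile distribution is in scope, the RDF one is not.
BEES = "https://data.exampleMS.gov/id/dataset/bees"
SHP = "https://data.exampleMS.gov/id/distribution/bees-shp"
RDF = "https://data.exampleMS.gov/id/distribution/bees-rdf"
DIST = "http://www.w3.org/ns/dcat#distribution"

catalogue = [
    (BEES, APPLICABLE_LEGISLATION, HVD_IR),
    (SHP, APPLICABLE_LEGISLATION, HVD_IR),
    (BEES, DIST, SHP),
    (BEES, DIST, RDF),
]
reporting = prune_out_of_scope_links(catalogue)
print(len(reporting))
```

    Only the connecting triples are pruned; supportive information such as contact points or agents is left untouched by this sketch.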

    Example 14 - MS Catalogue in HVD scope

    Based on this catalogue, the MS can be audited for its conformance. During the assessment it might occur that the supplied information is not sufficient, and that the assessment must follow the references outside the supplied catalogue. For example, when assessing the permissiveness of the licences, the details of the referenced EU Licence must be consulted. Crossing these boundaries is a regular occurrence and can be done during the assessment without impacting the results when the supplied data is based on persistent identifiers (PURIs).

    The use of dereferenceable persistent identifiers could also lead to an agreement to supply a more condensed representation of the reporting catalogue. Provided that all catalogued resources in scope of HVD are in the Portal for European Data [[DEU]], the MS could simply supply the reduced catalogue as:

    Example 15 - MS Catalogue reduced

    This illustrates that when the dataset publishers provide the necessary information, and this is well integrated in the network of sharing metadata through the MS to the Portal for European Data, the data exchange for the reporting can be reduced to a minimum.

    Quick Reference of Classes and Properties

    This section provides a condensed tabular overview of the mentioned classes and properties in this specification. The properties are grouped under headings mandatory, recommended, optional and deprecated. These terms have the following meaning.
    | Class | Class IRI | Property Type | Property | Property IRI |
    |---|---|---|---|---|
    | Agent | http://xmlns.com/foaf/0.1/Agent | | | |
    | Catalogue | http://www.w3.org/ns/dcat#Catalog | Mandatory | applicable legislation | http://data.europa.eu/r5r/applicableLegislation |
    | Catalogue | http://www.w3.org/ns/dcat#Catalog | Mandatory | publisher | http://purl.org/dc/terms/publisher |
    | Catalogue | http://www.w3.org/ns/dcat#Catalog | Recommended | dataset | http://www.w3.org/ns/dcat#dataset |
    | Catalogue | http://www.w3.org/ns/dcat#Catalog | Recommended | licence | http://purl.org/dc/terms/license |
    | Catalogue | http://www.w3.org/ns/dcat#Catalog | Recommended | service | http://www.w3.org/ns/dcat#service |
    | Catalogue | http://www.w3.org/ns/dcat#Catalog | Optional | record | http://www.w3.org/ns/dcat#record |
    | Catalogue Record | http://www.w3.org/ns/dcat#CatalogRecord | Mandatory | primary topic | http://xmlns.com/foaf/0.1/primaryTopic |
    | Catalogued Resource | http://www.w3.org/ns/dcat#Resource | | | |
    | Concept | http://www.w3.org/2004/02/skos/core#Concept | | | |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Mandatory | applicable legislation | http://data.europa.eu/r5r/applicableLegislation |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Mandatory | contact point | http://www.w3.org/ns/dcat#contactPoint |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Mandatory | endpoint URL | http://www.w3.org/ns/dcat#endpointURL |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Mandatory | HVD category | http://data.europa.eu/r5r/hvdCategory |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Mandatory | quality of Service | http://xmlns.com/foaf/0.1/Page |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Mandatory | title | http://purl.org/dc/terms/title |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Recommended | endpoint description | http://www.w3.org/ns/dcat#endpointDescription |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Recommended | serves dataset | http://www.w3.org/ns/dcat#servesDataset |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Optional | licence | http://purl.org/dc/terms/license |
    | Data Service | http://www.w3.org/ns/dcat#DataService | Optional | rights | http://purl.org/dc/terms/rights |
    | Dataset | http://www.w3.org/ns/dcat#Dataset | Mandatory | applicable legislation | http://data.europa.eu/r5r/applicableLegislation |
    | Dataset | http://www.w3.org/ns/dcat#Dataset | Mandatory | HVD Category | http://data.europa.eu/r5r/hvdCategory |
    | Dataset | http://www.w3.org/ns/dcat#Dataset | Recommended | contact point | http://www.w3.org/ns/dcat#contactPoint |
    | Dataset | http://www.w3.org/ns/dcat#Dataset | Recommended | dataset distribution | http://www.w3.org/ns/dcat#distribution |
    | Dataset | http://www.w3.org/ns/dcat#Dataset | Recommended | publisher | http://purl.org/dc/terms/publisher |
    | Dataset | http://www.w3.org/ns/dcat#Dataset | Optional | conforms to | http://purl.org/dc/terms/conformsTo |
    | Dataset | http://www.w3.org/ns/dcat#Dataset | Optional | other identifier | http://www.w3.org/ns/adms#identifier |
    | Distribution | http://www.w3.org/ns/dcat#Distribution | Mandatory | access URL | http://www.w3.org/ns/dcat#accessURL |
    | Distribution | http://www.w3.org/ns/dcat#Distribution | Mandatory | applicable legislation | http://data.europa.eu/r5r/applicableLegislation |
    | Distribution | http://www.w3.org/ns/dcat#Distribution | Mandatory | HVD Category | http://data.europa.eu/r5r/hvdCategory |
    | Distribution | http://www.w3.org/ns/dcat#Distribution | Recommended | licence | http://purl.org/dc/terms/license |
    | Distribution | http://www.w3.org/ns/dcat#Distribution | Optional | access service | http://www.w3.org/ns/dcat#accessService |
    | Distribution | http://www.w3.org/ns/dcat#Distribution | Optional | linked schemas | http://purl.org/dc/terms/conformsTo |
    | Document | http://xmlns.com/foaf/0.1/Document | | | |
    | Identifier | http://www.w3.org/ns/adms#identifier | | | |
    | Kind | http://www.w3.org/2006/vcard/ns#Kind | Mandatory | contact page | http://www.w3.org/2006/vcard/ns#hasURL |
    | Kind | http://www.w3.org/2006/vcard/ns#Kind | Mandatory | email | http://www.w3.org/2006/vcard/ns#hasEmail |
    | Legal Resource | http://data.europa.eu/eli/ontology#LegalResource | | | |
    | Licence Document | http://purl.org/dc/terms/LicenseDocument | | | |
    | Literal | http://www.w3.org/2000/01/rdf-schema#Literal | | | |
    | Resource | http://www.w3.org/2000/01/rdf-schema#Resource | | | |
    | Rights statement | http://purl.org/dc/terms/RightsStatement | | | |
    | Standard | http://purl.org/dc/terms/Standard | | | |

    Acknowledgments

    The editors gratefully acknowledge the contributions made to this document by all members of the working group. This work was elaborated by a Working Group under SEMIC by Interoperable Europe. Interoperable Europe of the European Commission was represented by Pavlina Fragkou and Seth Van Hooland. Natasa Sofou, Makx Dekkers and Bert Van Nuffelen were the editors of the specification. Past and current contributors are: Alberto Abella, Anssi Ahlberg, Adam Arndt, Judie Attard, Julius Belickas, Nick Berkvens, Konstantis Bogucarskis, Peter Bruhn Andersen, Ewa Bukala, Martin Böhm, Nikolai Bülow Tronche, Ana Cano, Eileen Carroll, Egle Cepaitiene, Luisa Cidoncha, Marco Combetto, John Cunningham, Jitse De Cock, Ine de Visser, Kelly Deirdre, Makx Dekkers, Radko Domanska, Iwona Domaszewska, Ulrika Domellöf Mattsson, Alessio Dragoni, Nicolai Draslov, Frederik Emanualsson, Jordi Escriu, Jose-Luis Fernandez-Villacanas, Nuno Freire, Leyre Garralda, Alma Gonzalez, William Verbeeck, Bart Hanssens, Kieran Harper, Jasper Heide, Mika Honkanen, Peter Isrealsson, Fabian Kirstein, Michal Kitta, Jakub Klimek, Rae Knowler, Fredrik Knutsson, Peter Kochman, Sirkku Kokkola, Michal Kuban, Kaia Kulla, Maria Lenartowicz, Anja Litka, Anja Loddenkemper, Hagar Lowenthal, Melanie Mageean, Agata Majchrowska, Hugh Mangan, Estelle Maudet, Balint Miklos, Esther Minguela, Joachim Nielandt, Geraldine Nolf, Erik Obsteiner, Javier Orozco, Casper le Gras, Csapo Orsolya, Matthias Palmer, Alberto Palomo, Francesco Paolicelli, Eirini Pappi, Mihai Paunescu, Sylwia Pichlak Pawlak, Jiri Pilar, Ludger Rinsche, Daniele Rizzi, Reet Roosalu, Ana Rosa, Maik Roth, Antonio Rotundo, Michal Ruzicka, Jill Saligoe-Simmel, Fabian Santi, Giovanna Scaglione, Giampaolo Selitto, Martin Semberger, Paulo Seromenho, Jan Skornsek, Michele Spichtig, Emidio Stani, Kjersti Steien, Simon Steuer, Terje Sylvarnes, Martin Traunmuller, Kees Trautwein, Stavros Tsouderos, Thomas Tursics, Bert Van Nuffelen, Uwe Voges, Gabriella Wiersma, Jesper Zedlitz, Mantas Zimnickas.