This document provides guidance for creating new DCAT-AP profiles. It outlines a structured approach to identify use cases, consider appropriate solutions, build consensus, and publish profiles that extend or specialize the DCAT Application Profile for data portals in Europe.

Introduction

SEMIC uses DCAT-AP as a master profile of DCAT, which means that all DCAT-AP profiles must comply with the requirements and restrictions imposed by DCAT-AP, namely mandatory properties and controlled vocabularies that must be used with select properties. The process of creating a new DCAT-AP profile therefore starts from a minimal DCAT-AP profile, gradually extending it to cover the requirements driving the creation of a new profile.

In the first part of this guide, we cover the process of creating a new DCAT-AP profile in four steps. In the second part, we cover the most common problematic situations in more detail and suggest approaches to deal with them.

Target audience

This how-to guide is intended for policy officers leading a project that is considering the (re)use of DCAT-AP, and for their team members tasked with implementing the (re)use. The guide describes the high-level process to inform policy officers about the work to be undertaken and to support their decision-making. For the implementers of the new DCAT-AP profile, the guide provides details on how to concretely approach this task while remaining compatible with the wider DCAT-AP ecosystem.

How to create your DCAT-AP profile

Creating a new DCAT-AP profile is a systematic process that ensures compatibility with existing DCAT-AP implementations while addressing specific domain or organisational requirements. This section outlines the four essential steps that should be followed when developing a new profile.

Step 1 - Familiarise yourself with DCAT-AP

DCAT-AP covers various aspects of dataset cataloguing, such as basic dataset metadata, data services, dataset series, dataset versioning, geospatial aspects of datasets, relationships among datasets and agents, and more. These aspects have been collected from multiple domains and data catalogues. This does not mean that each data catalogue, and therefore each profile, needs to cover all of them. When creating a new DCAT-AP profile, we recommend starting with the minimal DCAT-AP compliance requirements, i.e. the mandatory classes and their mandatory properties for DCAT-AP, and therefore for every profile, identified in Provider requirements.

A minimum DCAT-AP conformant catalogue contains:

  1. Catalogue
    1. Title and description of the catalogue
    2. Publisher of the catalogue and their name
    3. Links to datasets in the catalogue (if any)
  2. For each Dataset (if any)
    1. Title and description of the dataset
    2. Links to distributions of the dataset (if any)
  3. For each Distribution of a dataset (if any)
    1. Access URL of the distribution
  4. For each Data Service (if any)
    1. Title of the data service
    2. Endpoint URL of the data service
DCAT-AP Minimal Profile
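
The checklist above can be turned into a simple automated check. The following is a minimal sketch in Python, assuming the catalogue is held as plain dictionaries. The key names (title, description, publisher, access_url, endpoint_url, datasets, distributions, data_services) and the function missing_minimal_metadata are hypothetical and are not DCAT-AP terms. A real implementation would more likely validate RDF data against SHACL shapes.

```python
# Hypothetical key names; they only mirror the checklist above, not DCAT-AP IRIs.
REQUIRED = {
    "catalogue": ("title", "description", "publisher"),
    "dataset": ("title", "description"),
    "distribution": ("access_url",),
    "data_service": ("title", "endpoint_url"),
}


def _missing(kind, record, path):
    """Return one message per required key that is absent or empty."""
    return [f"{path}: missing {key}" for key in REQUIRED[kind] if not record.get(key)]


def missing_minimal_metadata(catalogue):
    """List the minimal-profile items missing from a dictionary-based catalogue."""
    problems = _missing("catalogue", catalogue, "catalogue")
    publisher = catalogue.get("publisher")
    if publisher and not publisher.get("name"):
        problems.append("catalogue: missing publisher name")
    for i, dataset in enumerate(catalogue.get("datasets", [])):
        problems += _missing("dataset", dataset, f"dataset[{i}]")
        for j, distribution in enumerate(dataset.get("distributions", [])):
            problems += _missing(
                "distribution", distribution, f"dataset[{i}].distribution[{j}]"
            )
    for i, service in enumerate(catalogue.get("data_services", [])):
        problems += _missing("data_service", service, f"data_service[{i}]")
    return problems


# Illustrative data only: the second distribution lacks an access URL.
example = {
    "title": "Example catalogue",
    "description": "A catalogue used for illustration only.",
    "publisher": {"name": "Example Agency"},
    "datasets": [
        {
            "title": "Example dataset",
            "description": "An illustrative dataset.",
            "distributions": [{"access_url": "https://example.org/data.csv"}, {}],
        }
    ],
}

print(missing_minimal_metadata(example))
```

An empty result means that every item of the checklist is present. The sketch does not check controlled vocabularies or any of the optional properties.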

Before proceeding further, the use cases driving the creation of the new DCAT-AP profile need to be identified.

Step 2 - Consider your use cases and propose their resolution

Before extending the minimal DCAT-AP profile, the use cases targeted by the new DCAT-AP profile need to be described, so that approaches to their data representation can be designed. Examples of use cases:

For each use case, its data representation can be designed according to the following standard situations:

Use case covered by DCAT-AP

The part of DCAT-AP that covers the use case will be reused as-is in the new profile. This includes class and property labels, property cardinalities, definitions, usage notes and mandatory code lists from DCAT-AP, and a link to the reused DCAT-AP class or property.

Simple example - the use case is covered by an existing DCAT-AP property
Use case
As a data publisher, I want to represent dataset distribution file format.
Existing DCAT-AP solution
The Distribution format property of the Distribution class, using the EU Vocabularies File Type Named Authority List for values.
Resolution
Reuse the Distribution format property as-is, including label, description, cardinality and possible usage note in the new profile and link to the DCAT-AP description in the HTML documentation of the profile, as in GeoDCAT-AP 3.0.0 Distribution format.
More complex example - the use case is covered by an existing class and existing properties in DCAT-AP
Use case
As a data publisher, I want to represent a contact point for my datasets.
Existing DCAT-AP solution
The Dataset class connects through the contact point property to the Kind class. The Kind class has no further properties specified in DCAT-AP. However, following the vCard Ontology for representation of contact point information is prescribed.
Resolution
The contact point property and the Kind class will be reused as-is. This includes their labels, definitions, usage notes and property cardinalities. Since DCAT-AP does not prescribe specific properties to be used for the contact point, the same applies to the new DCAT-AP profile in this reuse as-is scenario. If specific properties for the contact point are required, see the next case, which covers context-specific adjustments.

Use case covered by DCAT-AP with context-specific adjustments required

For example, in Czechia, every distribution needs to have a licence. This is not required by DCAT-AP. It is, however, an adjustment that tightens existing constraints. Specifically, the cardinality of Distribution licence in DCAT-AP is 0..1, and in the new profile, it will be 1..1. However, if the Czech legislation mandated that multiple licences need to be attached, meaning cardinality 1..*, this would not be compliant with DCAT-AP Distribution licence, and another property would have to be used for this (in that case, skip to the next situations).

Generally speaking, these adjustments are possible, if they tighten existing constraints. It is not possible to loosen them under any conditions, as the result would no longer be compliant with DCAT-AP. For example:

Use case
As a publisher, for an HVD dataset, I require applicable legislation to point to a specific legislation IRI.
DCAT-AP solution
A Dataset can point to any applicable legislation identified by an IRI.
Resolution
Restriction of an existing DCAT-AP property in the new profile, as in DCAT-AP HVD Dataset applicable legislation, where a specific usage note is added, and the cardinality constraint is tightened from 0..* to 1..*, which is a compliant change. It would also be possible to make the terminology more context specific, e.g. rename the Dataset class to "HVD Dataset", which was not deemed necessary here.

Use case not covered by DCAT-AP, but covered by W3C DCAT or another vocabulary or DCAT-AP profile

Then reuse that approach, provided that it does not cause a semantic issue. For example:

Use case
As a data consumer, I want to know the language of a Data Service.
DCAT-AP solution
DCAT-AP does not have a solution.
DCAT solution
DCAT specifies the language property for the class Catalogued Resource. This property can be reused: Data Service is a subclass of Catalogued Resource, so no semantic issue arises.
Resolution
Reuse property from DCAT in the new DCAT-AP application profile and link to the property description in DCAT, as in GeoDCAT-AP 3.0.0 Data Service language.

The use case is not covered by any known or admissible vocabulary to be reused

Then an original approach needs to be developed, typically supported by a new vocabulary defining new classes and properties.

For example, in Czechia, the local copyright legislation requires a finer description of a Distribution licence than what is supported in DCAT-AP, where these are just IRIs with the optional Licence types. Therefore, additional new properties based on the Czech legislation needed to be defined for Licence document and added to DCAT-AP-CZ.

Another example:

Use case
As a data publisher, I want to annotate Data Services with their category.

Approach:

  • Since no appropriate property was found, we create our own in our own namespace, e.g. http://data.europa.eu/930/serviceCategory along with its label, definition and possible domain and range.
  • Note that in this case, the property is used with a code list that is already published as linked data, which is a prerequisite for code lists to be used with DCAT-AP profiles.
  • Use it in the new DCAT-AP profile like GeoDCAT-AP 3.0.0 Service category.

Less frequent, or more complex situations

Less frequent, or more complex situations are discussed in a separate section: Specific situations and how to deal with them.

When in doubt about a use case resolution, you may reach out to the DCAT-AP community using GitHub. It may be that your use case is addressed by other DCAT-AP profiles or in other contexts known by the DCAT-AP community, and thus that you can reuse their approach.

When multiple profiles address the same use case, the DCAT-AP community may initiate a process for a cross-domain common DCAT-AP approach to streamline the different profile variations. For that to happen, the community needs to be aware of existing and emerging DCAT-AP profiles. We kindly request that newly created DCAT-AP profiles are announced as a DCAT-AP GitHub issue.

Step 3 - Involve relevant stakeholders to build a consensus about the new profile

When creating a new DCAT-AP profile, it is crucial to get all stakeholders affected by the profile to the table as early as possible in the profile development process or at least to make sure that they are undeniably given the opportunity to join the process. This is to avoid problems with non-acceptance of the developed profile by the stakeholders, or complaints by the stakeholders about not having the opportunity to influence the profile and therefore having problems implementing it later.

Furthermore, to avoid later discussions about opinions not considered etc., the process of consensus building needs to be transparent, accessible and traceable. This can be solved by utilising a public repository for the new profile, e.g. GitHub or a public GitLab instance.

Each use case from the previous step should be clearly identified, e.g. as a GitHub issue. For an example, see Issue #73 of GeoDCAT-AP. The issue description should contain a problem statement or requirement, and a proposed resolution, or its variants, based on the situations described in this document.

The new profile community should be asked to consider the use case and express agreement or disagreement, and possibly an alternative to the proposed resolution. A consensus about the resolution of the individual use cases should be reached, giving credibility to the resolutions and the new DCAT-AP profile.

If there is insufficient activity around the profile, or discussions about some use cases do not lead to a consensus, webinars with the profile community should be organised, where interested parties may discuss alternatives and vote on a resolution in a more interactive way. For each webinar, a set of issues to be discussed should be identified and sent in advance with the invitation to that webinar. The resolutions reached then need to be recorded with the issues in the repository.

An example of such a process is the GeoDCAT-AP 3.0.0 revision, documented in the specification.

Step 4 - Publish the new DCAT-AP profile

It is important that the new DCAT-AP profile is published on the Web to avoid problems with its findability and accessibility. Artifacts of the profile, such as the human-readable documentation, the RDFS vocabulary with definitions of new classes and properties, or SHACL shapes validating the defined constraints, should be published at publicly accessible URLs. As these URLs need to be as stable as possible so as not to break references in the future, a conscious effort should be invested in their definition. See 10 Rules for Persistent URIs for inspiration. This may also include the use of persistent URL services such as w3id.org. A free web-hosting service such as GitHub Pages can be used. The following steps should be taken:

  1. Publish the new DCAT-AP profile.
    1. A human-readable, HTML-based specification of the application profile. As a specification authoring tool, Respec or Bikeshed can be used. It should contain the profile usage context, explanation how the profile is distinct from DCAT-AP, and contact information, e.g. a link to a GitHub repository, where feedback can be collected.
    2. A machine-actionable set of validation rules using SHACL shapes.
  2. If new classes and properties were created, publish the newly created vocabulary.
    1. A separate human-readable document describing the newly created classes and properties. As a specification authoring tool, Respec or Bikeshed can be used. It should contain a link to the profile it supports, and contact information, e.g. a link to a GitHub repository, where feedback can be collected.
    2. RDF Schema (or OWL) based definitions of the newly created classes and properties as an RDF file.

The SEMIC team can always be contacted for publication advice (tooling, modelling advice, publishing advice).

Making the profile publicly accessible serves not only the intended audience, but also the broader community. It increases cross-fertilisation, i.e. it helps avoid situations where various profiles address similar use cases in different ways, which makes them less interoperable and the overall profile ecosystem more complex.

Specific situations and how to deal with them

In this part of the guide, we focus on less frequent and more complex situations that can be encountered when creating a new DCAT-AP profile that are not covered by the standard ones described above. When in doubt about a use case resolution, you may reach out to the DCAT-AP community using GitHub. It may be that your use case is addressed by other DCAT-AP profiles or in other contexts known by the DCAT-AP community, and thus that you can reuse their approach.

Expectations on Linked Data, Dereferenceability and URI behaviour

Problem sketch

DCAT-AP is a Semantic Web data specification. Although the specification does not require implementations to exchange data in native Linked Data formats (RDF serialisations), many editors and contributors implicitly expect this.

Therefore, the profile should not prevent the data exchange from being implemented as Linked Data. Consult also the section on identifiers in DCAT-AP.

What to do

  1. The DCAT-AP profile is a semantic web data specification:
    1. All new properties should be defined according to the Linked Data principles of persistent URIs.
    2. The profile specification should be a web-enabled HTML specification.
  2. The DCAT-AP profile data exchange may be done as Linked Data.
    1. All instances of the mentioned classes may have a URI as identifier. Ensure that this URI is not conflicting with the use of any identifier property (e.g. dct:identifier, adms:identifier).

Persistent URIs are:

  • Dereferenceable with a minimum content negotiation implemented:
    • If the Accept Header is text/html then return the human oriented HTML representation
    • If the Accept Header is an RDF serialisation then return the corresponding RDF representation
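
The minimum content negotiation described above can be sketched as follows. This is an illustration in Python, not a prescribed implementation: the function names, the SUPPORTED table and the choice to fall back to HTML when nothing matches are assumptions. The Accept parsing is deliberately simplified, so wildcards such as */* are not matched and simply lead to the default.

```python
# Media types this hypothetical server can produce, mapped to a representation label.
SUPPORTED = {
    "text/html": "html",
    "text/turtle": "turtle",
    "application/rdf+xml": "rdfxml",
    "application/ld+json": "jsonld",
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q; ties keep header order."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        entries.append((media_type, quality))
    # sorted() is stable, so entries with equal q stay in header order.
    return sorted(entries, key=lambda entry: -entry[1])


def choose_representation(accept_header, default="html"):
    """Pick the representation to return for a dereferenced URI."""
    for media_type, quality in parse_accept(accept_header or ""):
        if quality > 0 and media_type in SUPPORTED:
            return SUPPORTED[media_type]
    return default  # assumption: serve HTML when nothing matches


print(choose_representation("text/turtle"))
print(choose_representation("text/html,application/xhtml+xml;q=0.9"))
```

In a deployed service, this logic is usually provided by the web server or framework configuration rather than written by hand.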

Explanation

Not following these standards will prevent systems that rely on them from processing your data. If dereferenceability is not in place, systems and users will not be able to understand new elements or how they are to be used. In the best case, elements that are not dereferenceable are disregarded; in the worst case, the entire dataset might be thrown out. Not meeting these expectations creates a new barrier to interoperability, undoing the efforts made on the reuse of DCAT-AP.

Existing DCAT-AP property/class does not semantically fit the use case

Problem

For a given use case, there is a property/class in DCAT-AP that seems appropriate at first glance, e.g. based on its IRI or label, etc. However, upon closer examination, application of the property/class does not fit exactly its definition or usage note in DCAT-AP.

What to do

  1. If the description seems formulated vaguely in DCAT-AP, and a more fitting description is required, that does not contradict the DCAT-AP formulations and only makes them more specific to the context of the new profile, then reuse is still possible.
    1. If the changes are simply terminological, but otherwise it is OK to treat the property/class as it is defined in DCAT-AP, the more precise labels can be applied to the HTML documentation. For more information, see the SEMIC Style Guide.
      • Example: DCAT-AP has an applicable legislation property, but in the new profile, the authors would like to call it directives, as this label better fits their context. Other than that, the values still fit the definition "The legislation that mandates the creation or management of the Dataset." It is fine if, outside of the profile context, the values are labelled as applicable legislation.
    2. If the changes are more substantial, and there is need to differentiate between the profile class/property and other DCAT-AP usages of the class/property, then a subclass or subproperty needs to be created in a new vocabulary and used in the profile. For more information, see the SEMIC Style Guide.
      • Example: DCAT-AP has applicable legislation property, but in the new profile, it is important to distinguish applicable internal legislation and applicable national legislation. It is fine if both are interpreted as applicable legislation outside of the profile context, but in the profile, each must be validated separately, as the distinction is significant.
  2. If the required description has significant differences, and interpretation of the data using only the DCAT-AP description does not fit, then the class/property cannot be reused, as it would cause confusion with DCAT-AP users. A new class/property needs to be defined and used.
    1. Example: DCAT-AP has an applicable legislation property. In the new profile, the authors want to represent a relation between a Dataset and internal legislation in which the dataset is referred to. In this case, the property cannot be reused, because the definition says "The legislation that mandates the creation or management of the Dataset.", not legislation mentioning the dataset. Therefore, the reuse of the property in this context would be confusing from the DCAT-AP point of view.

Explanation

When reusing existing classes and properties, profile authors must make sure that the labels, definitions and usage notes in DCAT-AP still capture the meaning of the data, even if in a more abstract way, and that there are no semantic differences or contradictions. This is because the users outside of the profile context will interpret the data using the DCAT-AP interpretation, which must not be confusing.

Controlled vocabulary attached to a DCAT-AP property does not exactly fit

Problem sketch

A reused DCAT-AP property for a certain class fits a use case of the new profile. However, there is a controlled vocabulary attached to the property for this class in DCAT-AP, and it is not fit for purpose.

What to do

  • Some items that are required in the profile are missing in the controlled vocabulary
    • Solution 1: Try to extend the existing controlled vocabulary at its source. Many of the specified controlled vocabularies are managed in EU Vocabularies, and there is a possibility to submit additional items in them. Similarly, this may be possible with other controlled vocabulary publishers.
    • Solution 2: If the above solution is not feasible, create a separate controlled vocabulary, then proceed according to the constraint that DCAT-AP places on controlled vocabulary usage for the property in question:
      • The property MUST use the original controlled vocabulary in DCAT-AP
        1. Create a separate property
        2. Establish mapping between the overlapping items and consider generating the DCAT-AP compliant usage in addition to the profile specific one automatically.
      • The property MUST have at least one value from the original controlled vocabulary in DCAT-AP
        1. Use at least one value from the original controlled vocabulary with the original property
        2. For additional values of the original property, use the different controlled vocabulary.
        3. Consider establishing mapping with the original controlled vocabulary.
      • The property IS RECOMMENDED to use as range values codes from the original controlled vocabulary in DCAT-AP
        1. Use the separate controlled vocabulary with the original property
        2. Consider, if, in addition to the values from the separate controlled vocabulary, values from the original one can be used.
      • The property MAY use as range values codes from the original controlled vocabulary in DCAT-AP
        1. Use the separate controlled vocabulary with the original property
  • A different controlled vocabulary is to be used

Explanation

In DCAT-AP, when a controlled vocabulary MUST be used with a listed property of a specified class, the expectation is that whenever the property is used with that class, even in a DCAT-AP profile, it is used with the controlled vocabulary indicated by DCAT-AP. This applies unless otherwise specified in a usage note, e.g. in the case of the property dct:publisher and the EU Vocabularies Corporate bodies Named Authority List.

When, for a certain property used with a class, values from a controlled vocabulary MAY be used or ARE RECOMMENDED to be used, alternative controlled vocabularies are allowed or tolerated instead of values from the original one. When the property MUST have at least one value from a controlled vocabulary, values from other controlled vocabularies are allowed only in addition to the specified one.
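
The constraint levels explained above can be expressed as a small check. The following Python sketch is only an illustration: the function name, the level labels ("must", "must_at_least_one", "recommended", "may") and the example IRIs are hypothetical, and treating an unused property as acceptable is an assumption, since cardinality is a separate concern.

```python
def vocabulary_usage_ok(values, original_vocabulary, level):
    """Check property values against the DCAT-AP controlled vocabulary constraint.

    level is one of "must", "must_at_least_one", "recommended", "may".
    An unused property (no values) passes; cardinality is checked elsewhere.
    """
    if level not in {"must", "must_at_least_one", "recommended", "may"}:
        raise ValueError(f"unknown constraint level: {level}")
    if not values:
        return True
    from_original = [value for value in values if value in original_vocabulary]
    if level == "must":
        # Every value has to come from the original controlled vocabulary.
        return len(from_original) == len(values)
    if level == "must_at_least_one":
        # Other vocabularies are allowed only in addition to the original one.
        return len(from_original) >= 1
    # "recommended" and "may": alternative vocabularies are allowed or tolerated.
    return True


# Hypothetical IRIs, for illustration only.
ORIGINAL = {"https://example.org/original/A", "https://example.org/original/B"}
values = ["https://example.org/original/A", "https://example.org/profile/X"]

print(vocabulary_usage_ok(values, ORIGINAL, "must"))               # False
print(vocabulary_usage_ok(values, ORIGINAL, "must_at_least_one"))  # True
print(vocabulary_usage_ok(values, ORIGINAL, "recommended"))        # True
```

The sketch does not account for exceptions stated in usage notes, which have to be handled case by case.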

An existing DCAT-AP property description fits the addressed use case, but its range does not

Problem sketch

An existing DCAT-AP property fits the use case of the new profile we want to address. However, a different range is required. For example:

Use case
As a data publisher, I want to represent the spatial resolution of a Dataset in arbitrary units of measurement.
DCAT-AP situation
There is a property Dataset spatial resolution labelled "spatial resolution", with the definition "The minimum spatial separation resolvable in a dataset, measured in meters." and the range of xsd:decimal.
Problem
I want to be able to use arbitrary units of measurement, not just meters. This means that the range would be a class instead of a literal representing the number of meters.
Solution
A different property needs to be used, as reuse of the original property in this case would not be compatible with DCAT-AP. See the GeoDCAT-AP 3.0 section on the three different ways of representing spatial resolution.

There can be several types of range incompatibilities, and what can or cannot be done in each case depends on the exact nature of the incompatibility resulting from the combination of expected range types and the ones defined in DCAT-AP.

What to do

  1. The range in DCAT-AP is a class, the profile requires a literal datatype
    1. Reuse not possible. Another property needs to be used.
      • Example: In DCAT-AP we have a property Dataset contact point with the class vcard:Kind as range. The profile requires the value of contact point to be text, such as telephone: +420111222333, fax: 55448889. This is not possible; another property needs to be used.
  2. The range in DCAT-AP is a literal datatype, the profile requires a class
    1. Reuse not possible. Another property needs to be used.
      • Example: the introductory example above.
  3. The range in DCAT-AP is a literal datatype, the profile requires another literal datatype
    1. Reuse possible, if the profile datatype is a subtype of the DCAT-AP one.
      • Example: DCAT-AP Dataset Release Date is a Temporal Literal, which means that xsd:gYear, xsd:gYearMonth, xsd:date or xsd:dateTime is allowed.
        1. A profile may require that only xsd:date is used, which is OK.
        2. Another profile may require that also xsd:gDay can be used. This is, however, not possible, as that datatype is not expected by existing DCAT-AP implementations.
  4. The range in DCAT-AP is a class, the profile requires controlled vocabulary usage
    1. This means a selection from specific instances of the DCAT-AP indicated class, listed in a controlled vocabulary. This is OK.
      • Example: in DCAT-AP, we have a property Dataset contact point with the class vcard:Kind as range, meaning any resource can be used. The profile has a controlled vocabulary of predefined contact points to be used. This is OK.
      • Example: in DCAT-AP, dcterms:conformsTo is used to indicate "An implementing rule or other specification" of a dataset. This can be any standard. The profile is domain-specific, where only a limited set of standards is admissible. This set will be represented as a profile-specific controlled vocabulary, which is OK.
  5. The range in DCAT-AP is one class, the profile requires another class
    1. If the classes are semantically compatible, then it is OK. For interoperability purposes, it might be good to double type the resources so that even applications not supporting inference are able to process the data according to DCAT-AP.
      • Example: DCAT-AP has applicable legislation property, with range eli:LegalResource. In a profile, instances of another class, skos:Concept are used as values of the property. The effect is that the instances will be treated also as instances of eli:LegalResource in DCAT-AP implementations.
        1. If the definition of eli:LegalResource "This class represents the legislation, policy or policies that lie behind the Rules that govern the service." does not conflict with the instances of skos:Concept used in the profile, this is OK.
        2. However, if some of the instances used do not fit the definition of eli:LegalResource, then this is not compliant with DCAT-AP.

Explanation

Existing DCAT-AP implementations expect the value types (datatype, class instance, controlled vocabulary) specified for existing DCAT-AP properties. Narrowing down the value space will not create any issues, but extending it will.
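
For literal datatypes, the narrowing rule amounts to a subset test. The following Python sketch uses the Temporal Literal example from this section. The function name is hypothetical, and the check only compares the listed datatype names: it does not reason over the XSD derivation hierarchy.

```python
# Datatypes allowed by the DCAT-AP Temporal Literal range, as listed in this guide.
DCAT_AP_TEMPORAL_LITERAL = frozenset(
    {"xsd:gYear", "xsd:gYearMonth", "xsd:date", "xsd:dateTime"}
)


def extra_datatypes(profile_datatypes, dcat_ap_datatypes=DCAT_AP_TEMPORAL_LITERAL):
    """Return the profile datatypes that DCAT-AP does not allow.

    An empty list means the profile only narrows the DCAT-AP value space.
    """
    return sorted(set(profile_datatypes) - set(dcat_ap_datatypes))


print(extra_datatypes({"xsd:date"}))              # []
print(extra_datatypes({"xsd:date", "xsd:gDay"}))  # ['xsd:gDay']
```

A non-empty result signals an extension of the value space, which existing DCAT-AP implementations would not expect.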

You want to add a new property: to which class should it be assigned?

Problem sketch

When the use case resolution requires the definition of a new property, its domain and range should be defined as well. If this property is to be used in multiple contexts, the question is what its domain and range should be.

DCAT-AP applies the guideline to use properties at their most concrete usage context. For example, the applicable legislation of a Dataset is used at the level of a Dataset with the definition "The legislation that mandates the creation or management of the Dataset." Then, it is used again, at the level of a Data Service, as "The legislation that mandates the creation or management of the Data Service", reflecting the specifics of the context given by the class on which it is used. However, both usages are of the same property r5r:applicableLegislation. This property is defined separately, in the DCAT-AP Vocabulary, with the more generic definition "the legislation that is applicable to this resource" and the generic domain rdfs:Resource. This is done so that the property itself is reusable in multiple contexts. Then, its reuse can be further described in the specific context such as Dataset or Data Service.

Another example can be seen in DCAT-AP Issue #384 for the property spatial resolution, where one property is used in three different contexts, differently each time.

The same principle is suggested for properties defined in DCAT-AP profiles, i.e. a reusable definition of the property in a vocabulary, and then a description of its context-specific usage in the profile documentation.

What to do

  1. Define the new property in a supporting vocabulary. This consists of
    1. Create a supporting vocabulary in a managed namespace if one does not exist yet; otherwise, use the existing one. See the DCAT-AP Vocabulary for the http://data.europa.eu/r5r namespace as an example.
    2. Add the property to the supporting vocabulary.
    3. Ensure the domain is as open as possible: use dcat:Resource for catalogable resources, or rdfs:Resource for generic resources
    4. Ensure the range is as open as possible:
      1. For object properties use rdfs:Resource, or a very generic class
      2. For datatype properties use rdfs:Literal, or rdf:langString (to ensure multilinguality is always supported)
    5. Provide the label, definition and usage note. Ensure coherency with the formal specification. Most importantly replace the local class context with the corresponding generic context.
      1. For example, do not write "status of the Dataset" as the definition when the domain is dcat:Resource. Write "status of the catalogued resource" instead.
  2. Ensure that this supporting vocabulary is correctly published in HTML and RDF.
  3. Apply the newly created property in the DCAT-AP profile.
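
As an illustration of step 1, the following Python sketch renders a generic property definition as Turtle text. Everything specific in it is an assumption made for the example: the function name, the namespace https://example.org/ns/myprofile# and the status property are hypothetical. The sketch only builds a string and expects single-line labels and definitions; in practice an RDF library and a vocabulary publication toolchain would normally be used.

```python
def property_turtle(iri, label, definition,
                    domain="rdfs:Resource", range_="rdfs:Resource", lang="en"):
    """Render a minimal RDFS description of a property as Turtle text."""

    def literal(text):
        # Escape backslashes and double quotes for a single-line Turtle string.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"@{lang}'

    lines = [
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "",
        f"<{iri}> a rdf:Property ;",
        f"    rdfs:label {literal(label)} ;",
        f"    rdfs:comment {literal(definition)} ;",
        f"    rdfs:domain {domain} ;",
        f"    rdfs:range {range_} .",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical property in a hypothetical namespace, with a generic definition.
turtle = property_turtle(
    "https://example.org/ns/myprofile#status",
    label="status",
    definition="The status of the catalogued resource.",
    domain="dcat:Resource",
)
print(turtle)
```

Note that the definition refers to the catalogued resource rather than to a Dataset, in line with the open domain. The context-specific wording then belongs in the profile documentation.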

Explanation

Applying the new property is no different from using any other property already present in DCAT-AP. Use the same reuse steps as described in the previous situations to document the usage as precisely as possible.

Considering the DCAT-AP profile and the supporting vocabulary as two distinct specifications, each with their own lifecycle, removes the design conflict (detailed usage context versus generic reusable context) profile builders experience.

DCAT-AP property fits a use case, except for its cardinality

Problem sketch

A DCAT-AP property has a cardinality different from the one required by the new profile.

What to do

  1. DCAT-AP is more restrictive than required
    1. In DCAT-AP, the property is mandatory (1..1, 1..*)
      1. Then it must be mandatory in the profile as well.
    2. In DCAT-AP, the property can have at most one value (1..1, 0..1)
      1. Then in the profile, the property also must have at most one value.
      2. If there is a good case to change this on DCAT-AP level, raise an issue.
  2. DCAT-AP is less restrictive than required
    1. No action required, a profile can be more restrictive than DCAT-AP.
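
These rules reduce to a simple comparison of bounds: a profile cardinality is compliant when its minimum is not lower, and its maximum not higher, than in DCAT-AP. The following Python sketch illustrates this. The function names are hypothetical, and cardinalities are assumed to be written as min..max with * for unbounded.

```python
import math


def parse_cardinality(text):
    """Parse 'min..max' (max may be '*') into a (min, max) pair."""
    low, _, high = text.partition("..")
    return int(low), math.inf if high == "*" else int(high)


def is_compliant_cardinality(dcat_ap, profile):
    """True if the profile cardinality only tightens the DCAT-AP one."""
    base_min, base_max = parse_cardinality(dcat_ap)
    prof_min, prof_max = parse_cardinality(profile)
    return prof_min <= prof_max and prof_min >= base_min and prof_max <= base_max


print(is_compliant_cardinality("0..1", "1..1"))  # True: tightened to mandatory
print(is_compliant_cardinality("0..1", "1..*"))  # False: maximum was loosened
print(is_compliant_cardinality("0..*", "1..*"))  # True: minimum was tightened
print(is_compliant_cardinality("1..1", "0..1"))  # False: mandatory made optional
```

The first three calls correspond to the Distribution licence and HVD applicable legislation examples discussed earlier in this guide.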

Explanation

Existing DCAT-AP implementations expect values for mandatory properties and treat properties limited to a single value as such. Applications might implement single values as simple attributes and not expect arrays in code. SPARQL queries may not aggregate over variables that are expected to have at most one value, or may expect values for mandatory properties. These may break if the constraints are violated.

You want to reuse a DCAT-AP class but need to change the existing usage scope or definitions

Problem sketch

A new DCAT-AP profile describes the use of DCAT-AP in a new usage context. The generic DCAT-AP descriptions of classes such as Catalogue, Dataset or Distribution may need to be contextualised so that the profile is helpful to its audience.

Example: In the new profile context, the audience calls Data Interface what in DCAT-AP is called a Distribution. In the new profile, the class dcat:Distribution will therefore have the Data Interface label, which will be used throughout the profile document.

A related challenge may affect the classes' usage notes or definitions. The profile's usage context applies to the whole document. However, a profile may combine several distinct usage contexts.

Example: I want to have a different set of mandatory properties for Distribution when it represents a downloadable file, and a different set of mandatory properties for Distribution when it represents a data service. Then in the usage notes of all properties, I need to repeat sentences like "If the Distribution represents a Data Service, this is mandatory. If the Distribution represents a downloadable file, this is not mandatory."

Without taking any measures, integrating distinct usage contexts into one class leads to more complex and longer usage descriptions. And, therefore, increasing the risk for misinterpretations.

What to do

  1. Describe the usage context in the introductory sections of the DCAT-AP profile. A profile without a well-circumscribed context is subject to the existential question: what does this profile address differently from DCAT-AP?
    • Example 1: State "This profile describes Datasets coming from the Zoological domain." It is then clear that what is written in this profile does not apply elsewhere.
    • Example 2: See the DCAT-AP HVD Context
  2. Where necessary, improve the usage notes of the reused DCAT-AP classes to give readers as precise a context as possible, so that they understand the difference between an instance of a class in DCAT-AP and an instance of the corresponding class in the new DCAT-AP profile.
  3. If there are multiple separate contexts for the same reused class, as in the Distribution example above, create a subclass of the corresponding DCAT-AP class for each context. Apply the same publication rules as for new properties (as in ) and introduce the new class in the supporting vocabulary.
  4. If a usage context spans more than one class, prefer creating an additional, separate sub-profile for that context, containing all the affected classes and properties.
    1. Example: DCAT-AP HVD defines a specific usage context spanning Datasets, Data Services and Distributions. That is why it is a separate profile (annex) of DCAT-AP.
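Step 3 can be sketched for the Distribution example as follows. This is an illustration only: the ex: namespace, the two subclasses and their shapes are hypothetical, and the choice of dcat:downloadURL and dcat:accessService as the mandatory properties is an assumption made for the example.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <https://example.org/my-profile#> .   # hypothetical supporting vocabulary

# One subclass of the DCAT-AP class per usage context
ex:FileDistribution a rdfs:Class ;
  rdfs:subClassOf dcat:Distribution ;
  rdfs:label "File Distribution"@en .

ex:ServiceDistribution a rdfs:Class ;
  rdfs:subClassOf dcat:Distribution ;
  rdfs:label "Service Distribution"@en .

# Each context gets its own set of mandatory properties
ex:FileDistributionShape a sh:NodeShape ;
  sh:targetClass ex:FileDistribution ;
  sh:property [ sh:path dcat:downloadURL ; sh:minCount 1 ] .

ex:ServiceDistributionShape a sh:NodeShape ;
  sh:targetClass ex:ServiceDistribution ;
  sh:property [ sh:path dcat:accessService ; sh:minCount 1 ] .
```

With this split, the usage notes of each subclass no longer need conditional sentences. Instances would typically also be typed as dcat:Distribution, so that consumers that do not apply subclass inference can still recognise them.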

Explanation

Each profile must state why it exists. That scope is reflected in the usage scope of its classes. When a profile combines multiple usage scopes in one document, profile editors must either generalise the usage scopes to fit all cases, or create subclasses or sub-profiles to document the requirements for each usage scope.

Reasoning example: If one were to merge DCAT-AP with its annex DCAT-AP HVD into one document with a single formal representation of the two usage contexts (outside of HVD and within HVD), one could either:

  1. Write in the usage notes of each property: "If this property is used to describe a High Value Dataset then the following additional guidelines hold ..."
    1. If that applies to a single property in one class, then it may be acceptable.
    2. But in this case, it would apply to many properties, inflating the formal description and burdening the reader, who must correctly apply the guidelines depending on the usage context.
  2. Create an HVD Dataset subclass of the DCAT-AP Dataset and associate all HVD requirements solely with that subclass. Readers of the single document who are interested in the HVD context will then find more condensed descriptions of the implementation requirements.
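The second option could look roughly like the sketch below. It is purely illustrative and does not reflect how DCAT-AP HVD is actually published: ex:HVDDataset and its shape are hypothetical, and the two properties are assumed here to be the additional HVD requirements on a Dataset.

```turtle
@prefix sh:     <http://www.w3.org/ns/shacl#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcat:   <http://www.w3.org/ns/dcat#> .
@prefix dcatap: <http://data.europa.eu/r5r/> .
@prefix ex:     <https://example.org/merged-profile#> .   # hypothetical namespace

# Hypothetical subclass carrying the HVD usage context
ex:HVDDataset a rdfs:Class ;
  rdfs:subClassOf dcat:Dataset ;
  rdfs:label "HVD Dataset"@en .

# All HVD-specific requirements attach to the subclass only,
# so the generic Dataset description stays free of conditional notes.
ex:HVDDatasetShape a sh:NodeShape ;
  sh:targetClass ex:HVDDataset ;
  sh:property [ sh:path dcatap:applicableLegislation ; sh:minCount 1 ] ;
  sh:property [ sh:path dcatap:hvdCategory ; sh:minCount 1 ] .
```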

However, if the usage context spans more than a single class and requires more support, then it is advised to create an additional, separate sub-profile of the profile being created, describing that usage context. This is the motivation for creating DCAT-AP HVD as an annex (profile) of DCAT-AP, as there are specific HVD-related requirements not only for Datasets, but also for Data Services and Distributions.

You need to reuse an external vocabulary to describe a related resource. How to express the expectations of that related resource?

Problem sketch

Related issue: https://github.com/SEMICeu/GeoDCAT-AP/issues/111.

There is a use case that can be addressed by reusing an external specification to describe a related resource. The specification already contains extensive documentation on how to describe that resource. Should this description be replicated in the new profile, or is linking to it enough?

Example: In DCAT-AP, a standard to which a Dataset conforms is linked using the conforms to property of the Dataset, which has dcterms:Standard as its range.

Conforms to property of a Dataset in DCAT-AP

The class Standard is then described separately, but no specific properties are included in the DCAT-AP specification for this class.

Standard class in DCAT-AP

There are no specific requirements on how the linked standards should be described in DCAT-AP.

On the other hand, in GeoDCAT-AP, the class Standard has specific properties from Dublin Core and other vocabularies listed, specifying expectations that GeoDCAT-AP applications might have on Standards used in the GeoDCAT-AP context.

Standard class in GeoDCAT-AP

The question is whether these properties should be present in GeoDCAT-AP metadata records (e.g. in a catalogue) for every standard linked, or whether it is expected that a repository of Standards, which may exist independently of the catalogue, will use these properties.

# GeoDCAT-AP record
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<#service> a dcat:DataService ;
  dct:conformsTo <http://www.opengis.net/def/crs/OGC/1.3/CRS84> .

# Should the following be in the GeoDCAT-AP record, or in a separate repository of standards?
<http://www.opengis.net/def/crs/OGC/1.3/CRS84> a dct:Standard ;
  dct:identifier "urn:ogc:def:crs:OGC:1.3:CRS84"^^xsd:anyURI ;
  dct:title "CRS84"@en ;
  dct:type <http://inspire.ec.europa.eu/glossary/SpatialReferenceSystem> ;
  skos:inScheme <http://www.opengis.net/def/crs/OGC> ;
  skos:prefLabel "CRS84"@en .

An orthogonal question is whether these expectations depend on the Standard having an IRI and, if it has one, on that IRI being dereferenceable.

# GeoDCAT-AP record, standard without IRI
<#service> a dcat:DataService ;
  dct:conformsTo [ a dct:Standard ;
    dct:identifier "urn:ogc:def:crs:OGC:1.3:CRS84"^^xsd:anyURI ;
    dct:title "CRS84"@en ;
    dct:type <http://inspire.ec.europa.eu/glossary/SpatialReferenceSystem> ;
    skos:inScheme <http://www.opengis.net/def/crs/OGC> ;
    skos:prefLabel "CRS84"@en
  ] .

Standards with non-dereferenceable IRIs, or standards not identified by IRIs, might have to be described using these properties if they are to be identifiable.

Both situations also relate to how the profile owner positions the data specification towards its audience: as a specification in the landscape of other profiles, or as an implementation guideline. From the first perspective, it is natural to refer to other specifications for semantics and guidelines, whereas the implementation perspective draws in decisions based on an interpretation of these referenced specifications. That eases the work for implementers but makes the profile more dependent on an implementation perspective.

What to do

DCAT-AP profiles should be designed as much as possible from a generic profile perspective, staying close to the core classes of DCAT-AP. This approach is about defining the common ground for both perspectives. From this core, an extension towards implementation support can be made easily.

For the following steps we assume object properties.

  1. Set the range for the property to the most appropriate generic class.
    1. The referenced description of the range is sufficient, or requires only minimal implementation guidelines:
      • Do not add any further formal constraints to the range.
    2. The range needs more elaboration:
      • Data exchanges prefer the use of URIs as values, e.g., the value of a licence is preferably a URI.
        1. Reconsider adding additional requirements in the formal specification.
        2. If needed, add a separate section to handle the case when the shared value is not a URI.
      • Additional properties are necessary, even if the values are to be shared as URIs.
        1. Consider the impact the property may have on the management of instances of the range class. For instance, if one requires a title for a licence, then the profile imposes that all licences must have a title. As DCAT-AP profiles overlap, this requirement affects not only data exchange in the context of the profile, but also work in the context of other profiles when they are used in combination.
        2. Introduce the property presence requirement in the most generic way possible, to minimise cross-profile interference.
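For the simplest case (step 1.1), the formal part of the profile could be limited to a range statement, as in the following sketch. It assumes the profile is expressed as SHACL shapes. The ex: namespace and ex:DatasetShape are hypothetical.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <https://example.org/my-profile/shapes#> .   # hypothetical namespace

ex:DatasetShape a sh:NodeShape ;
  sh:targetClass dcat:Dataset ;
  sh:property [
    sh:path dct:conformsTo ;
    # Range only: values are expected to be instances of the generic class.
    # Note that sh:class checks for a type statement in the validated data.
    sh:class dct:Standard ;
    # Deliberately no nested constraints (such as a mandatory title)
    # on the Standard itself; its description is left to the
    # referenced specification.
  ] .
```

If additional properties on the range class turn out to be necessary (step 1.2), they could be added later as a separate shape for dct:Standard, which keeps the impact on other profiles visible and easy to review.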

Explanation

By focussing on the needs of your use case rather than including all the original information of a reused asset, your users can focus on the essentials, the readability of the specification is safeguarded, and the burden of achieving interoperability is reduced.

Summary

In this section, we have covered the following less frequent, more complex situations, in addition to the common ones in :