Copyright © 2025
This document provides guidance for creating new DCAT-AP profiles. It outlines a structured approach to identify use cases, consider appropriate solutions, build consensus, and publish profiles that extend or specialize the DCAT Application Profile for data portals in Europe.
SEMIC uses DCAT-AP as a master profile of DCAT, which means that all DCAT-AP profiles must comply with the requirements and restrictions imposed by DCAT-AP, namely mandatory properties and controlled vocabularies that must be used with select properties. The process of creating a new DCAT-AP profile therefore starts from a minimal DCAT-AP profile, gradually extending it to cover the requirements driving the creation of a new profile.
In the first part of this guide, we cover the process of creating a new DCAT-AP profile in 4 steps. In the second part, we cover the most common problematic situations in more detail and suggest approaches to deal with them.
This how-to guide is intended for policy officers leading a project that is considering the (re)use of DCAT-AP, and for their team members tasked with implementing the (re)use. The guide describes the high-level process to inform policy officers about the work to be undertaken and to support their decision-making. For the implementers of the new DCAT-AP profile, the guide provides details on how to concretely approach this task while remaining compatible with the wider DCAT-AP ecosystem.
Creating a new DCAT-AP profile is a systematic process that ensures compatibility with existing DCAT-AP implementations while addressing specific domain or organisational requirements. This section outlines the four essential steps that should be followed when developing a new profile.
DCAT-AP focuses on various aspects of dataset cataloguing such as basic dataset metadata, data services, dataset series, dataset versioning, geospatial aspects of datasets, relationships among datasets and agents, and more. These aspects are the ones collected from multiple domains and data catalogues. This does not mean that each data catalogue, and therefore each profile, needs to cover all these aspects. When creating a new DCAT-AP profile, we recommend starting with the minimal DCAT-AP compliance requirements, i.e., the mandatory classes and their mandatory properties for DCAT-AP, and therefore for every profile, identified in Provider requirements.
A minimum DCAT-AP conformant catalogue contains:
Before proceeding further, the use cases driving the creation of the new DCAT-AP profile need to be identified.
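As an illustration of the minimal starting point described above, the following Turtle sketch shows a catalogue with one dataset. It assumes that the mandatory properties are title, description and publisher for a Catalogue, title and description for a Dataset, and name for an Agent; the authoritative list is the one in the Provider requirements. All IRIs under https://example.org/ are hypothetical placeholders.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Hypothetical catalogue carrying only the properties assumed to be mandatory.
<https://example.org/catalogue> a dcat:Catalog ;
    dct:title "Example catalogue"@en ;
    dct:description "A catalogue illustrating a minimal starting point."@en ;
    dct:publisher <https://example.org/agency> ;
    dcat:dataset <https://example.org/dataset/1> .

# The publisher is described as an Agent with a name.
<https://example.org/agency> a foaf:Agent ;
    foaf:name "Example Agency"@en .

# A dataset with only a title and a description.
<https://example.org/dataset/1> a dcat:Dataset ;
    dct:title "Example dataset"@en ;
    dct:description "A dataset described with the minimum of metadata."@en .
```

A new profile would then extend this core only where a use case requires it.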
Before extending the minimal DCAT-AP profile, the use cases targeted by the new DCAT-AP profile need to be described, so that approaches to their data representation can be designed. Examples of use cases:
For each use case, its data representation can be designed according to the following standard situations:
The part that covers the use case will be reused as-is in DCAT-AP. This includes class and property labels, property cardinalities, definitions, usage notes and mandatory code lists from DCAT-AP, and a link to the reused DCAT-AP class or property.
For example, in Czechia, every distribution needs to have a licence. This is not required by DCAT-AP. It is, however, an adjustment that tightens existing constraints. Specifically, the cardinality of Distribution licence in DCAT-AP is 0..1, and in the new profile, it will be 1..1. However, if the Czech legislation mandated that multiple licences be attached, meaning cardinality 1..*, this would not be compliant with the DCAT-AP Distribution licence and another property would have to be used for this (in that case, skip to the next situations).
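The tightened cardinality from this example could be expressed as a SHACL shape. This is a minimal sketch, not the shape published with DCAT-AP-CZ; the shape IRI under https://example.org/ is hypothetical, and it assumes that the Distribution licence is expressed with dct:license.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <https://example.org/profile/shapes#> .

# Hypothetical profile shape: every Distribution has exactly one licence.
ex:DistributionLicenceShape a sh:NodeShape ;
    sh:targetClass dcat:Distribution ;
    sh:property [
        sh:path dct:license ;
        sh:minCount 1 ;   # tightened from 0 (DCAT-AP) to 1 (profile)
        sh:maxCount 1 ;   # unchanged: raising it would loosen the constraint
    ] .
```

Data that passes this shape also satisfies the 0..1 cardinality of DCAT-AP, which is why the adjustment remains compliant.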
Generally speaking, these adjustments are possible, if they tighten existing constraints. It is not possible to loosen them under any conditions, as the result would no longer be compliant with DCAT-AP. For example:
0..* to 1..*, which is a compliant change. It would also be possible to make the terminology more context-specific, e.g. rename the Dataset class to "HVD Dataset", which was not deemed necessary here.
Then reuse that approach, given that it does not cause a semantic issue. For example:
Then an original approach needs to be developed, typically supported by a new vocabulary defining new classes and properties.
For example, in Czechia, the local copyright legislation requires a finer description of a Distribution licence than what is supported in DCAT-AP, where these are just IRIs with the optional Licence types. Therefore, additional new properties based on the Czech legislation needed to be defined for Licence document and added to DCAT-AP-CZ.
Another example:
Approach:
http://data.europa.eu/930/serviceCategory along with its label, definition and possible domain and range.
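Defining a new property with its label, definition, domain and range might look as follows in an RDFS vocabulary. This is only a sketch: the namespace https://example.org/vocab# and the property ex:databaseRights are hypothetical and are not taken from DCAT-AP-CZ or any other published profile.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix ex:   <https://example.org/vocab#> .

# Hypothetical new property describing a licence document in more detail.
ex:databaseRights a rdf:Property ;
    rdfs:label "database rights"@en ;
    rdfs:comment "A statement on the database rights that apply under this licence document."@en ;
    rdfs:domain dct:LicenseDocument ;
    rdfs:range rdfs:Resource ;
    rdfs:isDefinedBy <https://example.org/vocab> .
```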
Less frequent, or more complex situations are discussed in a separate section: Specific situations and how to deal with them.
When in doubt about a use case resolution, you may reach out to the DCAT-AP community using GitHub. It may be that your use case is addressed by other DCAT-AP profiles or in other contexts known by the DCAT-AP community, and thus that you can reuse their approach.
When multiple profiles address the same use case, the DCAT-AP community may initiate a process for a cross-domain common DCAT-AP approach to streamline the different profile variations. For that to happen, the community needs to be aware of existing and emerging DCAT-AP profiles. We kindly request that newly created DCAT-AP profiles are announced as a DCAT-AP GitHub issue.
When creating a new DCAT-AP profile, it is crucial to get all stakeholders affected by the profile to the table as early as possible in the profile development process or at least to make sure that they are undeniably given the opportunity to join the process. This is to avoid problems with non-acceptance of the developed profile by the stakeholders, or complaints by the stakeholders about not having the opportunity to influence the profile and therefore having problems implementing it later.
Furthermore, to avoid later discussions about opinions not considered etc., the process of consensus building needs to be transparent, accessible and traceable. This can be solved by utilising a public repository for the new profile, e.g. GitHub or a public GitLab instance.
Each use case from the previous step should be clearly identified, e.g. as a GitHub issue. For an example, see Issue #73 of GeoDCAT-AP. The issue description should contain a problem statement or requirement, and a proposed resolution, or its variants, based on the situations described in this document.
The new profile community should be asked to consider the use case and express agreement or disagreement, and possibly an alternative to the proposed resolution. A consensus about the resolution of the individual use cases should be reached, giving credibility to the resolutions and the new DCAT-AP profile.
If there is insufficient activity around the profile, or discussions about some use cases do not lead to a consensus, webinars with the profile community should be organised, where interested parties may discuss alternatives and vote on a resolution in a more interactive way. For each webinar, a set of issues to be discussed should be identified and sent in advance with the invitation to that webinar. The resolutions reached then need to be recorded with the issues in the repository.
An example of such a process is the GeoDCAT-AP 3.0.0 revision, documented in the specification.
It is important that the new DCAT-AP profile is published on the Web to avoid problems with its findability and accessibility. Artifacts of the profile such as the human-readable documentation, the RDFS vocabulary with definitions of new classes and properties, or SHACL shapes validating defined constraints should be published at publicly accessible URLs. As these URLs need to be as stable as possible so as not to break references in the future, a conscious effort should be invested in their definition. See 10 Rules for Persistent URIs for inspiration. This may also include the use of persistent URL services such as w3id.org. A free web-hosting service, e.g. GitHub Pages, can be used. The following steps should be taken:
The SEMIC team can always be contacted for publication advice (tooling, modelling advice, publishing advice).
Making the profile publicly accessible serves not only the intended audience, but also the broader community, increasing cross fertilisation, i.e. avoiding situations where various profiles address similar use cases in different ways, making them less interoperable, and the overall profile ecosystem more complex.
In this part of the guide, we focus on less frequent and more complex situations that can be encountered when creating a new DCAT-AP profile that are not covered by the standard ones described above. When in doubt about a use case resolution, you may reach out to the DCAT-AP community using GitHub. It may be that your use case is addressed by other DCAT-AP profiles or in other contexts known by the DCAT-AP community, and thus that you can reuse their approach.
DCAT-AP is a Semantic Web data specification. Although the specification does not require implementations to exchange data in native Linked Data formats (RDF serialisations), many editors and contributors implicitly expect this.
Therefore, the profile should not block the implementation as Linked Data exchange. Consult also the section on identifiers in DCAT-AP.
dct:identifier, adms:identifier).
Persistent URIs are
If the Accept header is text/html, then return the human-oriented HTML representation.
If the Accept header is an RDF serialisation, then return the corresponding RDF representation.
Not following these standards will cause systems that rely on them to not be able to process your data. If dereferenceability is not in place, systems and users will not be able to understand new elements or how they are to be used. Elements that are not dereferenceable are disregarded in the best case; in the worst case, the entire dataset might be thrown out. Not meeting these expectations creates a new barrier to interoperability, undoing the efforts made on the reuse of DCAT-AP.
For a given use case, there is a property/class in DCAT-AP that seems appropriate at first glance, e.g. based on its IRI or label, etc. However, upon closer examination, application of the property/class does not fit exactly its definition or usage note in DCAT-AP.
When reusing existing classes and properties, profile authors must make sure that the labels, definitions and usage notes in DCAT-AP still capture the meaning of the data, even if in a more abstract way, and that there are no semantic differences or contradictions. This is because the users outside of the profile context will interpret the data using the DCAT-AP interpretation, which must not be confusing.
A reused DCAT-AP property for a certain class fits a use case of the new profile. However, there is a controlled vocabulary attached to the property for this class in DCAT-AP, and it is not fit for purpose.
AS_NEEDED and NOT_PLANNED items were added to the Frequency controlled vocabulary that MUST be used with Dataset frequency.
In DCAT-AP, the expectation for controlled vocabularies that MUST be used with listed properties of specified classes is that if the property is used with that class, even in a DCAT-AP profile, it is used with the controlled vocabulary indicated by DCAT-AP. This applies unless otherwise specified in a usage note, e.g. in the case of the property dct:publisher and the EU Vocabularies Corporate bodies Named Authority List.
When, for a certain property used with a class, values from a controlled vocabulary MAY be used or ARE RECOMMENDED to be used, alternative controlled vocabularies are allowed or tolerated instead of values from the original one. When the property MUST have at least one value from a controlled vocabulary, values from other controlled vocabularies are allowed only in addition to the specified one.
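The last rule can be illustrated with a small Turtle sketch. It assumes dcat:theme on a Dataset, with the EU Vocabularies Data Theme list, as an example of a vocabulary that MUST be used; the dataset IRI and the second theme IRI under https://example.org/ are hypothetical.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .

<https://example.org/dataset/air-quality> a dcat:Dataset ;
    dct:title "Air quality measurements"@en ;
    dct:description "Hourly air quality measurements."@en ;
    # Value from the controlled vocabulary required for this property.
    dcat:theme <http://publications.europa.eu/resource/authority/data-theme/ENVI> ;
    # Additional value from a hypothetical profile-specific vocabulary.
    dcat:theme <https://example.org/themes/air-quality> .
```

Dropping the first value and keeping only the profile-specific one would no longer meet the MUST requirement. This pattern also only works for properties whose cardinality allows more than one value.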
An existing DCAT-AP property fits the use case of the new profile we want to address. However, a different range is required. For example:
xsd:decimal.
There can be several types of range incompatibilities, and what can or cannot be done in each case depends on the exact nature of the incompatibility resulting from the combination of expected range types and the ones defined in DCAT-AP.
xsd:gYear, xsd:gYearMonth, xsd:date or xsd:dateTime is allowed.
xsd:date is used, which is OK.
xsd:gDay can be used. This is, however, not possible, as that datatype is not expected by existing DCAT-AP implementations.
dcterms:conformsTo is used to indicate An implementing rule or other specification of a dataset. This can be any standard. The profile is domain-specific, where only a limited set of standards is admissible. This set will be represented as a profile-specific controlled vocabulary, which is OK.
eli:LegalResource. In a profile, instances of another class, skos:Concept, are used as values of the property. The effect is that the instances will also be treated as instances of eli:LegalResource in DCAT-AP implementations.
eli:LegalResource "This class represents the legislation, policy or policies that lie behind the Rules that govern the service." does not conflict with the instances of skos:Concept used in the profile, this is OK.
eli:LegalResource, then this is not compliant with DCAT-AP.
Existing DCAT-AP implementations expect value types (datatype, class instance, controlled vocabulary) specified for existing DCAT-AP properties. Narrowing down the value space will not create any issues, but extending it will.
When the use case resolution requires definition of a new property, its domain and range should be defined as well. If this property is to be used in multiple contexts, the question is, what should be its range.
DCAT-AP applies the guideline to use properties at their most concrete usage context. For example, the applicable legislation of a Dataset is used at the level of a Dataset with the definition "The legislation that mandates the creation or management of the Dataset." Then, it is used again, at the level of a Data Service, as "The legislation that mandates the creation or management of the Data Service", reflecting the specifics of the context given by the class on which it is used. However, both usages are of the same property r5r:applicableLegislation. This property is defined separately, in the DCAT-AP Vocabulary, with the more generic definition "the legislation that is applicable to this resource" and the generic domain rdfs:Resource. This is done so that the property itself is reusable in multiple contexts. Then, its reuse can be further described in the specific context such as Dataset or Data Service.
Another example can be seen in DCAT-AP Issue #384 for the property spatial resolution, where one property is used in three different contexts, differently each time.
The same principle is suggested for properties defined in DCAT-AP profiles, i.e. a reusable definition of the property in a vocabulary, and then a description of its context-specific usage in the profile documentation.
http://data.europa.eu/r5r namespace as an example.
dcat:Resource for catalogable resources, or rdfs:Resource for generic resources
rdfs:Resource, or a very generic class
rdfs:Literal, or rdf:langString (to ensure multilinguality is always supported)
dcat:Resource. Write status of the catalogued resource instead.
This application is not different from the usage of any other property already present in DCAT-AP. Use the same reuse steps as described in previous situations to document the usage as precisely as possible.
Considering the DCAT-AP profile and the supporting vocabulary as two distinct specifications, each with their own lifecycle, removes the design conflict (detailed usage context versus generic reusable context) profile builders experience.
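The separation between a reusable vocabulary definition and its context-specific use in a profile can be sketched as two small Turtle fragments. The namespaces under https://example.org/ and the property exv:qualityLabel are hypothetical and serve only to show the pattern.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix exv:  <https://example.org/vocab#> .
@prefix exp:  <https://example.org/profile/shapes#> .

# Vocabulary: one generic, reusable definition of the property.
exv:qualityLabel a rdf:Property ;
    rdfs:label "quality label"@en ;
    rdfs:comment "A quality label awarded to this resource."@en ;
    rdfs:domain rdfs:Resource ;
    rdfs:range rdfs:Resource .

# Profile: the same property, described in the context of a Dataset.
exp:DatasetShape a sh:NodeShape ;
    sh:targetClass dcat:Dataset ;
    sh:property [
        sh:path exv:qualityLabel ;
        sh:name "quality label"@en ;
        sh:description "A quality label awarded to the Dataset."@en ;
    ] .

# Profile: the same property, described in the context of a Data Service.
exp:DataServiceShape a sh:NodeShape ;
    sh:targetClass dcat:DataService ;
    sh:property [
        sh:path exv:qualityLabel ;
        sh:name "quality label"@en ;
        sh:description "A quality label awarded to the Data Service."@en ;
    ] .
```

The vocabulary part can be versioned and published on its own, while the shapes evolve with the profile.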
A DCAT-AP property has a cardinality different from the one required by the new profile.
1..1, 1..*)
1..1, 0..1)
0..1 to 0..*, which was escalated as a DCAT-AP issue.
Existing DCAT-AP implementations expect values for mandatory properties and treat properties whose number of values is limited to 1 as single-valued. Applications might implement such values as simple attributes and not expect arrays in code. SPARQL queries may not aggregate over variables that are expected to have at most one value, or may expect values for mandatory properties. These may break if the constraints are violated.
A new DCAT-AP profile describes the use of DCAT-AP in a new usage context. The generic DCAT-AP descriptions for the classes, such as Catalogue, Dataset, Distribution, ... may need to be contextualised for the profile to be supportive for the profile audience.
Example: In the new profile context, the audience calls "Data Interface" what in DCAT-AP is called a Distribution. In the new profile, the class dcat:Distribution will therefore have the "Data Interface" label, which will be used throughout the profile document.
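One possible way to record such a contextualised label in the formal artifacts of the profile is to attach it to the profile's shape for the class. This is a sketch under that assumption; the shape IRI under https://example.org/ is hypothetical.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <https://example.org/profile/shapes#> .

# Hypothetical profile shape: the label belongs to the shape, not to the class.
ex:DataInterfaceShape a sh:NodeShape ;
    sh:targetClass dcat:Distribution ;
    rdfs:label "Data Interface"@en ;
    rdfs:comment "A Distribution as it is known to the audience of this profile."@en .
```

Placing the label on the shape leaves dcat:Distribution itself untouched, so data exchanged with other DCAT-AP implementations keeps using the original class.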
Another related challenge may occur influencing the classes' usage notes or definitions. The profile's usage context applies to the whole document. It, however, may happen that the profile combines several distinct usage contexts together.
Example: I want to have a different set of mandatory properties for Distribution when it represents a downloadable file, and a different set of mandatory properties for Distribution when it represents a data service. Then in the usage notes of all properties, I need to repeat sentences like If the Distribution represents a Data Service, this is mandatory. If the Distribution represents a downloadable file, this is not mandatory.
Without taking any measures, integrating distinct usage contexts into one class leads to longer and more complex usage descriptions, and therefore increases the risk of misinterpretation.
Each profile must express its motivation to exist. That scope is reflected in the usage scoping of the classes. When a profile combines multiple usage scopes in one document, profile editors must either generalise the usage scopes to fit all cases, or create subclasses or sub-profiles to document the requirements for each usage scope.
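The subclass option can be sketched for the Distribution example above, where a downloadable file and a data service need different mandatory properties. The classes and shapes under https://example.org/ are hypothetical, and the choice of dcat:downloadURL and dcat:accessService as the mandatory properties is an assumption made for illustration.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix ex:   <https://example.org/profile#> .

# Hypothetical subclasses, one per usage scope.
ex:FileDistribution rdfs:subClassOf dcat:Distribution .
ex:ServiceDistribution rdfs:subClassOf dcat:Distribution .

# A file distribution must have at least one download URL.
ex:FileDistributionShape a sh:NodeShape ;
    sh:targetClass ex:FileDistribution ;
    sh:property [
        sh:path dcat:downloadURL ;
        sh:minCount 1 ;
    ] .

# A service distribution must point to at least one data service.
ex:ServiceDistributionShape a sh:NodeShape ;
    sh:targetClass ex:ServiceDistribution ;
    sh:property [
        sh:path dcat:accessService ;
        sh:minCount 1 ;
    ] .
```

Each usage note then only needs to describe one scope. Instances should still be typed as dcat:Distribution as well, so that implementations without subclass reasoning recognise them.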
Reasoning example: If one would merge DCAT-AP with its annex DCAT-AP HVD into one document having one formal representation of the two usage contexts (outside of HVD and within HVD) then one can either:
However, if the usage context extends beyond a single class and requires more support, then it is advised to create an additional, separate sub-profile of the currently created profile, describing that usage context. This is the motivation for creating DCAT-AP HVD as an annex (profile) of DCAT-AP, as there are specific HVD-related requirements not only for Datasets, but also for Data Services and Distributions.
Related issue https://github.com/SEMICeu/GeoDCAT-AP/issues/111.
There is a use case that can be addressed by reusing an external specification to describe a related resource. The specification already contains extensive documentation on how to describe that resource. Should this description be replicated in the new profile, or is linking to it enough?
Example: In DCAT-AP, a standard to which a Dataset conforms is linked using the Dataset conforms to property, which has dcterms:Standard as a range. The class Standard is then described separately, but no specific properties are included in the DCAT-AP specification for this class. There are no specific requirements on how the linked standards should be described in DCAT-AP.
On the other hand, in GeoDCAT-AP, the class Standard has specific properties from Dublin Core and other vocabularies listed, specifying expectations that GeoDCAT-AP applications might have on Standards used in the GeoDCAT-AP context.
The question is whether these properties should be present in GeoDCAT-AP metadata records (e.g. in a catalogue) for every standard linked, or whether it is expected that a repository of Standards, which may exist independently of the catalogue, will use these properties.
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
# GeoDCAT-AP record
<#service> a dcat:DataService ;
    dct:conformsTo <http://www.opengis.net/def/crs/OGC/1.3/CRS84> .
# Should the following be in the GeoDCAT-AP record, or in a separate repository of standards?
<http://www.opengis.net/def/crs/OGC/1.3/CRS84> a dct:Standard ;
    dct:identifier "urn:ogc:def:crs:OGC:1.3:CRS84"^^xsd:anyURI ;
    dct:title "CRS84"@en ;
    dct:type <http://inspire.ec.europa.eu/glossary/SpatialReferenceSystem> ;
    skos:inScheme <http://www.opengis.net/def/crs/OGC> ;
    skos:prefLabel "CRS84"@en .
An orthogonal problem is whether these expectations are influenced by the Standard having an IRI or not, and that IRI being or not being dereferenceable.
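@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
# GeoDCAT-AP record, standard without IRI (described as a blank node)
<#service> a dcat:DataService ;
    dct:conformsTo [ a dct:Standard ;
        dct:identifier "urn:ogc:def:crs:OGC:1.3:CRS84"^^xsd:anyURI ;
        dct:title "CRS84"@en ;
        dct:type <http://inspire.ec.europa.eu/glossary/SpatialReferenceSystem> ;
        skos:inScheme <http://www.opengis.net/def/crs/OGC> ;
        skos:prefLabel "CRS84"@en
    ] .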
Standards with non-dereferenceable IRIs or standards not identified by IRIs might have to be described by the properties, if they are to be identifiable.
Both situations also relate to how the profile owner positions the data specification to its audience: as a specification in the landscape of other profiles or as an implementation guideline. In the first perspective it is natural to refer to other specifications for the semantics and guidelines, while the implementation perspective draws in decisions based on the interpretation of these referenced specifications. That eases the work for implementers but makes the profile more dependent on an implementation perspective.
DCAT-AP profiles should be designed as much as possible from a generic profile perspective, staying close to the core classes of DCAT-AP. This approach is about defining the common ground for both perspectives. From this core, an extension towards implementation support can be made easily.
For the following steps we assume object properties.
By focussing on the needs of your use case rather than including all the original information of a reused asset, your users can focus on the essentials, the readability of the specification is safeguarded, and the burden of creating interoperability is eased.
In this section, we have covered the following less frequent, more complex situations, in addition to the common ones in :