1. Introduction
A Linked Data Event Stream (LDES) (ldes:EventStream) is a collection of immutable objects, each object being described using a set of RDF triples ([rdf-primer]).
This specification uses the TREE specification for its collection and fragmentation (or pagination) features, which in turn is compatible with other specifications such as [activitystreams-core], [VOCAB-DCAT-2], [LDP] or Shape Trees. For the specific compatibility rules, read the TREE specification.
The ldes:EventStream class is an rdfs:subClassOf the tree:Collection class.
It extends the tree:Collection class by stating that all of its members are immutable; members can therefore only be added to the collection.
Note: Once a client has processed a member, it should never have to process it again. A Linked Data Event Stream client can thus keep a list (or cache) of already processed member IRIs. A reference implementation of a client is available as part of the Comunica framework on npm and GitHub.
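The caching behaviour described in this note can be sketched as follows. This is a minimal illustration, not the Comunica reference implementation: the class name ProcessedMemberCache, the method process_new, and the handler callback are all hypothetical names, and the cache is kept in memory only.

```python
from typing import Callable, Iterable, Set


class ProcessedMemberCache:
    """Remembers which member IRIs have already been handled (hypothetical helper)."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def process_new(self, member_iris: Iterable[str],
                    handler: Callable[[str], None]) -> int:
        """Call handler once per unseen IRI and return how many were new."""
        new_count = 0
        for iri in member_iris:
            if iri in self._seen:
                continue  # members are immutable, so a seen IRI never needs reprocessing
            handler(iri)
            self._seen.add(iri)
            new_count += 1
        return new_count
```

A client that re-fetches a page would pass all member IRIs found on it to process_new; only the members it has not seen before reach the handler.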
<C1> a ldes:EventStream ;
    tree:shape <shape1.shacl> ;
    tree:member <Observation1> .

<Observation1> a sosa:Observation ;
    sosa:resultTime "2021-01-01T00:00:00Z"^^xsd:dateTime ;
    sosa:hasSimpleResult "..." .
A tree:shape SHOULD be defined with the ldes:EventStream instance as its subject. The shape of the collection defines its members: it tells clients that all old and new members of the stream have been and will be validated by that shape. As a consequence of the immutability of the members, this shape MAY evolve, but it MUST always be backwards compatible with the earlier version.
Clients MAY use the shape of the ldes:EventStream to prioritize their source selection.
Note: When you need to change an earlier version of an ldes:EventStream, there are two options: create a new version of the object with a new shape that is backward compatible, and add the new version of that object again as a member on the stream, or replicate and transform the entire collection into a new ldes:EventStream. You can indicate that the new ldes:EventStream is derived from
<C2> a ldes:EventStream ;
    tree:shape <shape2.shacl> ;
    tree:member <AddressRecord1/version1> .

<AddressRecord1/version1> dcterms:created "2021-01-01T00:00:00Z"^^xsd:dateTime ;
    adms:versionNotes "First version of this address" ;
    dcterms:isVersionOf <AddressRecord1> ;
    dcterms:title "Streetname X, ZIP Municipality, Country" .
Note: In Example 1, we consider the Observation object to be an immutable object, so we can use the existing identifiers. In Example 2, however, we still had to create version IRIs in order to be able to link to immutable objects.
2. Fragmenting and pagination
The focus of an LDES is to allow clients to replicate the history of a dataset and efficiently synchronize with its latest changes. Linked Data Event Streams MAY be fragmented when their size becomes too big for a single HTTP response. Fragmentations MUST be described using the features in the TREE specification. All relation types from the TREE specification MAY be used.
<C1> a ldes:EventStream ;
    # This SHACL shape will need to stay backwards compatible for as long as this collection exists.
    tree:shape <shape1.shacl> ;
    tree:member <Observation1> , ... ;
    tree:view <?page=1> .

<?page=1> a tree:Node ;
    tree:relation [
        a tree:GreaterThanOrEqualToRelation ;
        tree:path sosa:resultTime ;
        tree:node <?page=2> ;
        tree:value "2020-12-24T12:00:00Z"^^xsd:dateTime
    ] .
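How a client might use such a relation to decide which pages to fetch can be sketched as below. The sketch assumes the TREE reading of tree:GreaterThanOrEqualToRelation, namely that every member reachable through the related node has a value on the tree:path that is greater than or equal to tree:value. The dataclass and the function nodes_worth_following are hypothetical names, and the relations are assumed to be already parsed from RDF.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass
class GreaterThanOrEqualToRelation:
    """Parsed form of one tree:relation (hypothetical structure)."""
    node: str        # value of tree:node
    value: datetime  # value of tree:value


def nodes_worth_following(relations: List[GreaterThanOrEqualToRelation],
                          wanted_before: datetime) -> List[str]:
    """Return the nodes that may still hold members older than wanted_before.

    A node whose relation value is at or after wanted_before can be skipped,
    because all of its members are at or after that value.
    """
    return [r.node for r in relations if r.value < wanted_before]
```

A client that wants the full history would simply follow every relation; pruning like this only matters when the client is interested in a bounded time range.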
A tree:importStream MAY be used to describe a publish-subscribe interface to subscribe to new members in the LDES.
Note: A 1-dimensional fragmentation based on creation time of the immutable objects is probably going to be the most interesting and highest priority fragmentation for an LDES, as only the latest page, once replicated, should be subscribed to for updates. However, this may not always be the case: sometimes the back-end of an LDES server cannot guarantee that objects will be published chronologically. In that case, in the spirit of an LDES’ goal, the publisher should make sure as many pages as possible can be given an HTTP Cache-Control: public, max-age=604800, immutable header, and expose another kind of pagination.
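A server-side sketch of the caching advice in this note follows. The immutable header value is the one quoted above; everything else is an assumption made for illustration: the function name cache_control_for, the page_is_closed flag (meaning the server will never add members to that page again), and the 60-second lifetime for pages that can still change.

```python
# Header value quoted in the note above, for pages that will never change again.
IMMUTABLE_PAGE = "public, max-age=604800, immutable"
# Assumption: a short lifetime for pages that may still receive new members.
MUTABLE_PAGE = "public, max-age=60"


def cache_control_for(page_is_closed: bool) -> str:
    """Pick a Cache-Control value for one page of the fragmentation."""
    return IMMUTABLE_PAGE if page_is_closed else MUTABLE_PAGE
```

The more pages a fragmentation can mark as closed, the more of the stream intermediaries and clients can cache indefinitely.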
Note: A search form can optionally make a one-dimensional feed of immutable objects more searchable; cf. the example in the TREE specification on “searching through a list of objects ordered in time”.
3. Retention policies
By default, an LDES MUST keep all data that has been added to the tree:Collection (or ldes:EventStream) as defined by the TREE specification.
It MAY add a retention policy in which the server indicates data will be removed from the server.
Third parties SHOULD read retention policies to understand what subset of the data is available in this tree:View, and MAY archive these members.
In the LDES specification, two types of retention policies are defined, which can be used with an ldes:retentionPolicy with an instance of a tree:View as its subject:
- ldes:DurationAgoPolicy: a time-based retention policy in which data generated before a specific time is removed
- ldes:LatestVersionSubset: a version subset based on the latest versions of an entity in the stream
Different retention policies MAY be combined. When policies are used together, a server MUST store the members as long as they are not all matched.
3.1. Time-based retention policies
A time-based retention policy can be introduced as follows:
<Collection> a ldes:EventStream ;
    tree:view <> .

<> ldes:retentionPolicy <P1> .

<P1> a ldes:DurationAgoPolicy ;
    tree:path prov:generatedAtTime ;
    tree:value "P1Y"^^xsd:duration . # Keep 1 year of data
An ldes:DurationAgoPolicy uses a tree:value with an xsd:duration-typed literal to indicate how far back, compared to the current time on the server, the timestamp (indicated by the tree:path) of the members that can be found via the tree:View may lie.
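The comparison this policy implies can be sketched as below. Several things here are assumptions rather than part of the specification: the function name is_retained, the choice to treat the boundary as inclusive, and the conversion of the xsd:duration into a Python timedelta before the call (the standard library has no xsd:duration parser, and a value such as "P1Y" has no fixed length in days, so 365 days is only an approximation).

```python
from datetime import datetime, timedelta, timezone


def is_retained(member_time: datetime, keep_for: timedelta, now: datetime) -> bool:
    """Return True when the member's timestamp lies within the retention window.

    member_time: the value found at the tree:path of the member.
    keep_for:    the tree:value duration, already converted to a timedelta.
    now:         the current time on the server.
    """
    return member_time >= now - keep_for


# Assumption: "P1Y" approximated as 365 days for this illustration.
ONE_YEAR = timedelta(days=365)
```

Passing the server time in explicitly keeps the check deterministic and easy to test.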
3.2. Version-based retention policies
The following example keeps only the 2 latest versions of each object referred to using dcterms:isVersionOf:
<Collection> a ldes:EventStream ;
    tree:view <> .

<> ldes:retentionPolicy <P1> .

<P1> a ldes:LatestVersionSubset ;
    ldes:amount 2 ;
    ldes:versionKey ( dcterms:isVersionOf ) .
An ldes:LatestVersionSubset SHOULD use two predicates: ldes:amount and ldes:versionKey.
The ldes:amount has an xsd:nonNegativeInteger datatype and indicates how many versions to keep; it defaults to 1.
The ldes:versionKey is an rdf:List of SHACL property paths indicating objects that MUST be concatenated together to find the key on which versions are matched.
When the ldes:versionKey is not set or empty, the key is empty, and all members MUST be seen as a version of the same thing.
<Collection> a ldes:EventStream ;
    tree:view <> .

<> ldes:retentionPolicy <P1> .

<P1> a ldes:LatestVersionSubset ;
    ldes:amount 2 ;
    ldes:versionKey ( ( sosa:observedProperty ) ( sosa:madeBySensor ) ) .
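How a version-based subset could be computed is sketched below, under stated assumptions. Members are modelled as plain dictionaries instead of RDF; the function name latest_version_subset and the timestamp_field parameter are hypothetical, since this section does not say which property orders the versions. The concatenated key is represented as a tuple of the values found at each path, and timestamps are assumed to be ISO 8601 strings in the same time zone so that they sort lexicographically.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Member = Dict[str, str]


def latest_version_subset(members: Sequence[Member],
                          version_key: Sequence[str],
                          amount: int = 1,
                          timestamp_field: str = "created") -> List[Member]:
    """Keep the `amount` most recent members for each version key."""
    groups: Dict[Tuple[str, ...], List[Member]] = defaultdict(list)
    for member in members:
        # An empty version_key yields the empty tuple, so all members share one group.
        key = tuple(member.get(path, "") for path in version_key)
        groups[key].append(member)

    kept: List[Member] = []
    for versions in groups.values():
        # Newest first, then keep only the first `amount` entries of the group.
        versions.sort(key=lambda m: m[timestamp_field], reverse=True)
        kept.extend(versions[:amount])
    return kept
```

With an empty version key every member falls into the same group, which mirrors the rule that all members are then seen as versions of the same thing.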
