Standardizing state changes in resources (history, undo, sync) #161
Comments
Missing above seems to be the issue of blank nodes and canonicalization (see Aidan Hogan and others).
Implemented by yours truly in https://github.com/solid/node-solid-server/blob/v5.2.2/lib/handlers/patch/n3-patch-parser.js (nodeSolidServer/node-solid-server#516)
Can you maybe be a bit more precise about the problem we are solving? Because storage is not a Solid concern (the specs only govern exchange). Is this about Because versioning itself is just Memento (and that design is part of server architectures).
I'm mostly thinking about client-server (two-way) communication, e.g. during collaborative document editing, but I think that standardized deltas can be useful in many contexts. And for many of these use cases, storing the deltas themselves is important (auditability + P2P state replication = why git is awesome). Now I agree that we should not care how these deltas are stored (any Solid server implementation can do whatever it likes), but providing a standard interface for accessing and appending these deltas is something that the spec maybe should cover.
OK but then we should probably have collaborative editing as an issue/use case.
exposing; slight difference, but important in the Solid context, because the specs only govern the exchanges.
I've just done a scrape of related use-cases from Solid and w3, and written it up on the forum, and indeed, it's not expressed directly. However solid/user-stories#22: "As a developer, I want to be able to subscribe to a stream of pod events" supports this ticket in general.
Using NQuads for the deltas, and using the context (fourth) column as an "action" indicator only works if you're applying these deltas to a single graph, i.e., that your target is not working with Named Graphs, which at least some Solid servers (will) do. Something to consider as this suggestion moves forward...
My colleague Thom suggested using a query parameter in the 'action' field, if you use named graphs.
I'm happy to see N3 Patch has been invented and added to the spec. This is an interesting alternative to the earlier mentioned existing specs. I'm still a bit sceptical to relying on the Anyway, if N3 patches are persisted (and named) by a server, it becomes possible to construct versions / a history.
Solid specs standardize how to represent the current state of a resource (RDF, in some valid serialization format), but there is no part of the spec that describes how to store or share the deltas / changes in data / patches / transactions (I'm just calling them Deltas from now on).
Why store & standardize deltas
Having an append-only event log / ledger that describes every single mutation in a pod can provide some cool features:
One might argue that we don't need a standardized event system for most of these features - every solid server implementation could create their own way of dealing with versioning, for example.
However, I think standardizing this would improve data portability. If these events are standardized, the user can maintain undo / version history across Solid servers. And besides individual advantages, it would enable more powerful and performant data synchronization. Even very large resources could be updated incrementally, triple for triple.
And besides, since RDF is a relatively simple model, I think standardizing this will not be too complicated.
What the standard should define
What this means for clients
Currently, (most) Solid apps write to a pod by writing a full RDF resource. This works fine for smaller documents, but it becomes very inefficient and error-prone when resources contain many triples. Therefore, I think that clients should be able to send these state changes to their pod, and both the pod and the client app should be able to parse the delta and apply it to their RDF store.
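To make the idea concrete, here is a minimal sketch of what "send the change instead of the whole resource" could look like. It assumes triples are plain tuples of already-serialized terms and that no blank nodes are involved (blank nodes need canonicalization first). The names `Delta`, `diff` and `apply_delta` are made up for this illustration and are not taken from any Solid spec or library.

```python
from typing import NamedTuple

# A triple is (subject, predicate, object), each an already-serialized RDF term.
Triple = tuple[str, str, str]


class Delta(NamedTuple):
    """Hypothetical delta shape: triples to add and triples to remove."""
    added: frozenset[Triple]
    removed: frozenset[Triple]


def diff(old: set[Triple], new: set[Triple]) -> Delta:
    """Compute the delta that turns `old` into `new` (ground triples only)."""
    return Delta(added=frozenset(new - old), removed=frozenset(old - new))


def apply_delta(store: set[Triple], delta: Delta) -> set[Triple]:
    """Return a new store with the delta applied; the input is not mutated."""
    return (store - delta.removed) | delta.added
```

With this shape, a client that changes one value in a large resource only has to transmit two triples (one removed, one added) rather than the full document.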
Ways to standardize event logs
Some initiatives already exist that aim to standardize how deltas should be serialized and interpreted.
Some things to keep in mind when considering (or designing) a delta standard:
RDF-Delta
This standard consists of two concepts: RDF Patch and the RDF Patch log. It introduces a new serialization format, similar to Turtle, where you can add some letters before statements that encode mutations. It also supports header items, e.g. to reference previous commits.
An Apache Jena implementation and a CLI app already exist.
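As a rough illustration of the "letters before statements" idea, the sketch below renders a set of changes as patch text. The row codes used here (`H` for header, `TX` / `TC` for transaction begin / commit, `A` for add, `D` for delete) follow my reading of the RDF Patch documentation; please check them against the actual spec before relying on them. The function name `to_rdf_patch` and the choice to emit deletes before adds are assumptions of this example, and terms are expected to be pre-serialized N-Triples terms.

```python
from typing import Iterable, Optional

# A triple of already-serialized terms, e.g. ("<urn:a>", "<urn:p>", '"1"').
Triple = tuple[str, str, str]


def to_rdf_patch(
    added: Iterable[Triple],
    removed: Iterable[Triple],
    patch_id: str,
    prev_id: Optional[str] = None,
) -> str:
    """Render changes as RDF Patch-style text (illustrative, not validated)."""
    lines = [f"H id <{patch_id}> ."]
    if prev_id is not None:
        # Header pointing at the previous patch, which is what forms a log.
        lines.append(f"H prev <{prev_id}> .")
    lines.append("TX .")  # begin transaction
    for s, p, o in sorted(removed):
        lines.append(f"D {s} {p} {o} .")
    for s, p, o in sorted(added):
        lines.append(f"A {s} {p} {o} .")
    lines.append("TC .")  # commit transaction
    return "\n".join(lines) + "\n"
```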
linked-delta
linked-delta is serialized in n-quads and uses the fourth column to semantically describe how a triple should be changed (e.g. update the value, add it, remove it). We created this and use it in our e-democracy application Argu to communicate state changes (when resource attributes change) between back-end and front-end.
The main benefit of this solution is that it is lightweight and does not require new serialization formats, and n-quads is the RDF serialization format that's the easiest to write a parser for. Since the fourth column uses IRIs, the spec is inherently extensible: any IRI can be added, which means that in the future we might come up with many other things than "add" or "replace". However, this might introduce complexity, since loaders (apps that play back the deltas) now might have to deal with unknown methods.
Some implementations exist: Link-lib (browser, TypeScript) and linked_rails (Ruby on Rails, server side).
This spec does not (yet) standardize the level above a set of quads - and I do think it makes sense to standardize how we denote who created a delta, whether it's signed, when it's created, what the previous hash is (to make a cryptographically valid ledger).
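A minimal sketch of what such a level above the quads could look like, assuming each log entry records an author, a creation time, the serialized delta, and the hash of the previous entry. The `LogEntry` fields and the helper names are invented for this example, and signing is deliberately left out; a real design would also need signatures and a canonical serialization.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical ledger entry wrapping one serialized delta."""
    author: str                   # e.g. a WebID
    created_at: str               # ISO 8601 timestamp
    delta: str                    # the serialized delta (any format)
    previous_hash: Optional[str]  # None for the first entry

    def digest(self) -> str:
        """SHA-256 over a deterministic JSON encoding of the entry."""
        payload = json.dumps(
            {
                "author": self.author,
                "created_at": self.created_at,
                "delta": self.delta,
                "previous_hash": self.previous_hash,
            },
            sort_keys=True,
        ).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def append_entry(log: list[LogEntry], author: str, created_at: str, delta: str) -> list[LogEntry]:
    """Return a new log with an entry chained to the current last entry."""
    previous_hash = log[-1].digest() if log else None
    return log + [LogEntry(author, created_at, delta, previous_hash)]


def verify_chain(log: list[LogEntry]) -> bool:
    """Check that every entry points at the hash of the entry before it."""
    expected: Optional[str] = None
    for entry in log:
        if entry.previous_hash != expected:
            return False
        expected = entry.digest()
    return True
```

Note that a bare hash chain only detects changes to entries that have a successor; the most recent entry still needs a signature (or a separately published head hash) to be trustworthy.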
Currently, the order in which statements appear in a linked-delta document does not have any semantic meaning, and there are rules that determine in what order a parser (loader?) should interpret all delta statements.
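To show how a loader might interpret the fourth column, here is a small sketch. The action IRIs below live under a placeholder `https://example.org/delta#` namespace and are not the IRIs defined by linked-delta. As a simplification, the sketch applies statements in document order rather than following the spec's own ordering rules, and it raises an error on unknown actions to make the extensibility concern visible.

```python
# Placeholder action IRIs for illustration only (not the linked-delta ones).
ADD = "https://example.org/delta#add"
REMOVE = "https://example.org/delta#remove"
REPLACE = "https://example.org/delta#replace"

Triple = tuple[str, str, str]
Quad = tuple[str, str, str, str]  # subject, predicate, object, action IRI


def apply_linked_delta(store: set[Triple], quads: list[Quad]) -> set[Triple]:
    """Apply delta quads to a single graph and return the new set of triples."""
    result = set(store)
    for s, p, o, action in quads:
        if action == ADD:
            result.add((s, p, o))
        elif action == REMOVE:
            result.discard((s, p, o))
        elif action == REPLACE:
            # Drop every existing value for this subject/predicate pair first.
            result = {t for t in result if (t[0], t[1]) != (s, p)}
            result.add((s, p, o))
        else:
            # A loader has to decide what to do with methods it does not know.
            raise ValueError(f"unknown delta action: {action}")
    return result
```

Because the fourth column is taken by the action, this only works when the target is a single graph, which is the limitation raised in the comments about named graphs.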
N3 Patches
Tim Berners-Lee mentioned N3 Patches during a meeting some time ago, as an alternative, but I failed to find more about this.
LD-Patch
LD-Patch is a W3C working group spec that also introduces a new serialization language.
SPARQL updates
SPARQL-Update supports INSERT and DELETE, so you could use these SPARQL Update strings to store deltas.
Using PROV / other reification methods
Maybe the right way to store changes is to express them in RDF, perhaps using the PROV ontology for this.
This would of course eliminate the need for a new serialization format.
However, I feel like it should be trivial / really simple to convert these change statements into valid RDF.
Atomic Commits
see https://docs.atomicdata.dev/commits/intro.html
disclaimer: This is a design of my own.
It's a JSON based serialization of state changes, which allows for full traceability using cryptographic signatures. It's implemented and used in atomic server and atomic data browser. Only works with a strict subset of RDF.
TL;DR
Using deltas to communicate state changes is efficient and makes P2P state sharing easier. Storing deltas makes it easier to deal with backups, versioning, undo, and adding new query options. Various solutions exist, but perhaps we need something else.
Most importantly, we should pick one, and I'd love to hear your thoughts on this!