Skip to content
This repository has been archived by the owner on Apr 13, 2022. It is now read-only.

Mention willful violation of the sparql-update spec #193

Closed
wants to merge 3 commits into from

Conversation

michielbdejong
Copy link
Contributor

No description provided.

api-rest.md Outdated Show resolved Hide resolved
api-rest.md Outdated Show resolved Hide resolved
api-rest.md Show resolved Hide resolved
@kjetilk
Copy link
Member

kjetilk commented Jun 19, 2019

Also note that I have proposed a feature for this in SPARQL 1.2: w3c/sparql-dev#60

If we add a discussion item, then it might be worth mentioning, but in general, I don't think that belongs in the spec.

@michielbdejong
Copy link
Contributor Author

yeah we should find a better wording because 'under discussion' implies that it's a discussion item that's not part of the current spec.

@RubenVerborgh
Copy link
Contributor

I mean that piece of text we add when something isn’t fully clear yet (cfr. globbing etc.) but fine without; it’s a draft spec anyway.

@michielbdejong
Copy link
Contributor Author

@RubenVerborgh comments applied

RubenVerborgh
RubenVerborgh previously approved these changes Jun 20, 2019
Copy link
Contributor

@RubenVerborgh RubenVerborgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good, thanks!

api-rest.md Outdated Show resolved Hide resolved
@michielbdejong
Copy link
Contributor Author

@timbl can you review?

@michielbdejong
Copy link
Contributor Author

Two thoughts:

  • as @kjetilk brought up, this gives a client with Write access a chance to partially discover what data was there. For instance, if I send a sparql-update with 'DELETE , INSERT ', and it succeeds, then I can infer that the <some> <secret> <boolean> triple existed. I guess the same is true in POSIX:
$ mkdir test-dir
$ chmod 222 test-dir/
$ touch test-dir/file.txt
$ ls -la
total 0
drwxr-xr-x   3 michiel  staff   96 Jan 15 09:27 .
drwxr-xr-x  15 michiel  staff  480 Jan 15 09:26 ..
d-wx-wx-wx   3 michiel  staff   96 Jan 15 09:27 test-dir
$ ls -la test-dir/
ls: : Permission denied
$ rm test-dir/what.txt
$ rm: test-dir/what.txt: No such file or directory
$ rm test-dir/file.txt
$ 
  • I think a PATCH should implicitly create the file if it doesn't exist (the server does that as a side-effect), so saying that this needs to happen through an LDP POST or PUT is confusing (implies that maybe the client should already have done it beforehand).

@RubenVerborgh
Copy link
Contributor

RubenVerborgh commented Jan 15, 2020

  • this gives a client with Write access a chance to partially discover what data was there. For instance, if I send a sparql-update with 'DELETE , INSERT ', and it succeeds, then I can infer that the <some> <secret> <boolean> triple existed.

We must definitely prevent that from happening.

However, note that:

  • DELETE currently requires Read permission for that reason
  • INSERT DATA (without a WHERE clause) MUST NOT fail with 409
  • INSERT with a WHERE clause requires Read permission

As such, I do not think that the case described above can happen; no 409 can ever be returned to an agent without Read permissions.

Relevant code.

@michielbdejong
Copy link
Contributor Author

Ah ok, so that resolves it; I tried this out with

curl -X 'PATCH' -H 'Content-Type: application/sparql-update' -d 'DELETE DATA { <#s> <#p> <#o8> }' https://michielbdejong.inrupt.net/public/public-write/test.ttl

And indeed, the DELETE requires both Write and Read.

@kjetilk
Copy link
Member

kjetilk commented Jan 15, 2020

The way the semaphore issue was described is that concurrent clients do delete and insert, and if the second delete doesn't match anything, then it should fail, and that failure would be used as a semaphore. That would be the same if DELETE DATA was used, if that didn't delete anything, the same semaphore could be derived from that, as it would be derived from a DELETE INSERT WHERE query form.

I have not fully appreciated the issue, as I haven't seen the queries. However, from @michielbdejong 's formulation

A WHERE clause should result in exactly one result mapping.

That also suggests to me that a triple rather than a triple pattern is what should have been used. I think we need to have a deeper exploration of the problem space and the solution space to design this mechanism.

Also, note that DELETE DATA, INSERT DATA and INSERT DELETE WHERE are different query forms, with rather different properties in SPARQL. As such, there is no such thing as

INSERT without a WHERE clause MUST NOT fail with 409

INSERT always has a WHERE clause, as evident from the syntax.

@RubenVerborgh
Copy link
Contributor

RubenVerborgh commented Jan 15, 2020

That also suggests to me that a triple rather than a triple pattern is what should have been used.

No, it's a triple pattern. The SPARQL UPDATE spec allows zero or more matches; current NSS mandates exactly one match, or fails with 409.

INSERT always has a WHERE clause

Indeed; adjusted above.

@kjetilk
Copy link
Member

kjetilk commented Jan 15, 2020

That also suggests to me that a triple rather than a triple pattern is what should have been used.

No, it's a triple pattern. The SPARQL UPDATE spec allows zero or more matches; current NSS mandates exactly one match, or fails with 409.

I know it is a triple pattern, but perhaps that's flaw, not a feature.

@RubenVerborgh
Copy link
Contributor

but perhaps that's flaw, not a feature.

Flaw of NSS or of SPARQL UPDATE?

The NSS behavior is deliberate (although I strongly disagree with it).

@kjetilk
Copy link
Member

kjetilk commented Jan 15, 2020

There are really many discussions, one is what subset of SPARQL Update should be required for Solid, another is the semaphore mechanism, and then there are more general discussions around transactions in SPARQL, both single-request atomicity and multi-request atomicity.

We find ourselves in quite a few of these discussions. I think that SPARQL Update has significant shortcomings that is unhelpful when designing the semaphore mechanism that we want.

However, I'm also thinking that we need to put more effort into understanding the queries that are being used, and see if there are alternative ways to write them, which again could inform both the semaphore mechanism and the required subset for SPARQL.

@RubenVerborgh
Copy link
Contributor

I think that SPARQL Update has significant shortcomings that is unhelpful when designing the semaphore mechanism that we want.

I can agree with that point. We might want a different patch format altogether for semaphores, perhaps N3-based. That said, the more the server should support, the harder it is.

However, I'm also thinking that we need to put more effort into understanding the queries that are being used, and see if there are alternative ways to write them,

We probably want to start a thread for that. I had looked at them and thought of a way to construct your own semaphore using a sequence of queries; but NSS does not support a sequence (and atomicity would also be required).

The interesting use case here is basically concurrent structured document editing. When a user changes value A into B, then B should only be set if another user has not changed A to C in the meantime. And that a document can consist of thousands of triples, so we don't want to lock the entire document.

@kjetilk
Copy link
Member

kjetilk commented Jan 15, 2020

We probably want to start a thread for that.

Yes, and I would have preferred to do that in general for SPARQL, which is what I started in w3c/sparql-dev#60, but I feel that I need to understand the actual problem better.

Perhaps we should develop it first in the context of Solid, and then see if that can inform the design in SPARQL 1.2.

The interesting use case here is basically concurrent structured document editing. When a user changes value A into B, then B should only be set if another user has not changed A to C in the meantime. And that a document can consist of thousands of triples, so we don't want to lock the entire document.

Yes, indeed, and exactly that A, B and C are probably known to the client and can be expressed in triples makes me think that the actual case lends itself well to the DELETE DATA, INSERT DATA query form.

@RubenVerborgh
Copy link
Contributor

A, B and C are probably known to the client

The existence and value of C are not, that's the entire issue.

@kjetilk
Copy link
Member

kjetilk commented Jan 15, 2020

The existence and value of C are not, that's the entire issue.

Ah, right, I was imprecise, because it doesn't need to be, since it doesn't occur in the delete, which is what the semaphore is based on.

Basically, what happens is that client 1 goes:

DELETE DATA { <foo> <baz> "Dahut" } ;
INSERT DATA { <foo> <baz> "Bar" }

independently, client 2 goes

DELETE DATA { <foo> <baz> "Dahut" } ;
INSERT DATA { <foo> <baz> "Foobar" }

The triple <foo> <baz> "Dahut" is known to both clients, whereas the second triple is known only to the respective clients. Therefore, you want a semaphore at the ; that fails the entire request with a 409 for the second case. That's my current understanding.

Actually, the idea I've been toying with is to take some more bashism, e.g.

DELETE DATA { <foo> <baz> "Dahut" } &
INSERT DATA { <foo> <baz> "Bar" }

So, & is a binary operator like in bash, which means, only insert if the delete succeeded... It remains to define what failure means, since now it always succeeds.

@RubenVerborgh
Copy link
Contributor

But that assumes we want a delete, whereas we also might want to insert when something (still) exists. However, I suppose you could just delete and re-add it.

@kjetilk
Copy link
Member

kjetilk commented Jan 15, 2020

But that assumes we want a delete, whereas we also might want to insert when something (still) exists. However, I suppose you could just delete and re-add it.

I didn't quite understand. Either, you'd just do a INSERT DATA, or you get a conflict, update and retry?

I suppose there will be room for more operators than & too :-)

@kjetilk
Copy link
Member

kjetilk commented Jan 16, 2020

While trying to sleep last night, I had some ideas that could solve many problems. But to address it in a structured manner, I propose we activate the Query Panel with its own repository and issue tracking.

@RubenVerborgh
Copy link
Contributor

RubenVerborgh commented Jan 16, 2020

I propose we activate the Query Panel with its own repository and issue tracking.

Yes, please. Good idea!

Just to finish up:

Either, you'd just do a INSERT DATA, or you get a conflict, update and retry?

The case is:

INSERT { <a> <b> <c>. }
WHERE  { <x> <y> <z>. }

So "if this (syntactically unrelated) thing is still there, add this other thing".

We could

DELETE DATA { <x> <y> <z>. }
INSERT DATA { <a> <b> <c>. <x> <y> <z>. }

but that might be a bit silly. (And it breaks when there are blank nodes.)

@kjetilk
Copy link
Member

kjetilk commented Jan 16, 2020

Ah, right, I get it. Yes, that too would be covered by my idea, now for writing up.

@kjetilk
Copy link
Member

kjetilk commented Jan 16, 2020

@justinwb
Copy link
Member

While trying to sleep last night, I had some ideas that could solve many problems. But to address it in a structured manner, I propose we activate the Query Panel with its own repository and issue tracking.

@kjetilk @RubenVerborgh a repo for the query panel has been created along with appropriate permissions assigned to the panel members that had originally added themselves (you were both already there). have also created a corresponding gitter channel.

@kjetilk
Copy link
Member

kjetilk commented Jan 17, 2020

Great, thanks a lot, @justinwb !

I have now entered several issues in there. The one closely corresponding to this PR is solid-contrib/query-panel#3 .

Perhaps we should close this PR now, so that further discussion will be directed there?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants