Update SPARQL grammar for reifier syntax #150

Merged
merged 5 commits into from
Aug 29, 2024

afs
Contributor

@afs afs commented Aug 16, 2024

@afs afs requested review from rubensworks, kasei, Tpt and hartig August 16, 2024 10:28
@hartig
Contributor

hartig commented Aug 16, 2024

Andy, thanks for adapting the grammar! I have added a few comments in my review. Additionally, I have the suspicion that there is an issue in the part of the grammar accepted by the ObjectPath production (the link here is directly into the preview of this PR).

For instance, the production accepts the expression

<< ?s ?p ?o ~ ?id >> ~ ?id2

because it accepts a GraphNodePath followed by an AnnotationPath, where the GraphNodePath in this case is the ReifiedTriple string (i.e., << ?s ?p ?o ~ ?id >> ) and the AnnotationPath is the Reifier string (i.e., ~ ?id2).

With that expression accepted by the ObjectPath production, one may write a triple pattern such as the following.

?x ?y  << ?s ?p ?o ~ ?id >> ~ ?id2 .

I don't think that makes sense, does it?
(I have to admit that, due to several weeks of vacation, I have not followed the recent discussions that led to the switch from | to ~.)

Other weird-looking things that may be written are

?x ?y  << ?s ?p ?o ~ ?id >> ~ ?id2 ~ ?id3 ~ ?id4 .

(because AnnotationPath has the *) and

?x ?y  << ?s ?p ?o ~ ?id >> ~ ?id2 {| :p :o |} ~ ?id3 .

@niklasl

niklasl commented Aug 16, 2024

With that expression accepted by the ObjectPath production, one may write a triple pattern such as the following.

?x ?y  << ?s ?p ?o ~ ?id >> ~ ?id2 .

I don't think that makes sense, does it? (I have to admit that, due to several weeks of vacation, I have not followed the recent discussions that led to the switch from | to ~.)

Other weird-looking things that may be written are

?x ?y  << ?s ?p ?o ~ ?id >> ~ ?id2 ~ ?id3 ~ ?id4 .

(because AnnotationPath has the *) and

?x ?y  << ?s ?p ?o ~ ?id >> ~ ?id2 {| :p :o |} ~ ?id3 .

Those do look odd (and probably shouldn't be written like that), but they all (can) make sense. Here's that last one flattened a bit:

?x ?y ?id .
<< ?s ?p ?o ~ ?id >> .

<< ?x ?y ?id ~ ?id2 >> .
?id2 :p :o .

<< ?x ?y ?id ~ ?id3 >> .

Here's some data with a similar shape, with readable symbol names to provide an informal indication of meaning:

<Alice> :says << <Book> :mentions <Bob> ~ <speechact-1> >>
    ~ <recording-2> {| :mentions <Bob> |}
    ~ <recording-3> .

Which is short for:

<Alice> :says <speechact-1> .
<speechact-1> rdf:reifies <<( <Book> :mentions <Bob> )>> .

<recording-2> rdf:reifies <<( <Alice> :says <speechact-1> )>> .
<recording-2> :mentions <Bob> .

<recording-3> rdf:reifies <<( <Alice> :says <speechact-1> )>> .

(Not something I'd put in a primer, but it might perhaps help out here.)
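The expansion above can also be sketched in Python. This is only an illustration under stated assumptions: the function name `expand`, the tuple encoding of `<< s p o ~ r >>`, and the requirement that every annotation carries an explicit reifier are hypothetical simplifications, not part of the grammar in this PR or of any parser.

```python
def expand(subject, predicate, obj, annotations):
    """Flatten `subject predicate obj ~r1 {| ... |} ~r2 .` into plain triples.

    obj is either a term string, or a 4-tuple (s, p, o, reifier) standing
    for the reified triple `<< s p o ~ reifier >>`.
    annotations is a list of (reifier, [(predicate, object), ...]) pairs;
    an empty list of pairs means a bare `~ reifier` with no annotation block.
    Assumption: every annotation names its reifier explicitly.
    """
    triples = []
    if isinstance(obj, tuple):
        s, p, o, reifier = obj
        # The reified triple in object position stands for its reifier.
        triples.append((reifier, "rdf:reifies", f"<<( {s} {p} {o} )>>"))
        obj = reifier
    # The asserted triple comes first, as in the hand-written expansion.
    triples.insert(0, (subject, predicate, obj))
    asserted = f"<<( {subject} {predicate} {obj} )>>"
    for reifier, pairs in annotations:
        # Each trailing reifier reifies the asserted triple ...
        triples.append((reifier, "rdf:reifies", asserted))
        # ... and the annotation block, if any, describes that reifier.
        for p, o in pairs:
            triples.append((reifier, p, o))
    return triples


for triple in expand(
    "<Alice>", ":says",
    ("<Book>", ":mentions", "<Bob>", "<speechact-1>"),
    [("<recording-2>", [(":mentions", "<Bob>")]), ("<recording-3>", [])],
):
    print(" ".join(triple), ".")
```

Called with the Alice example, the sketch yields the same five triples, in the same order, as the long form written out above.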

@afs afs force-pushed the grammar-reifiers branch from 050d96f to 9913504 Compare August 16, 2024 14:00
@hartig
Contributor

hartig commented Aug 16, 2024

Thanks @niklasl ! As mentioned, I was out for 4+ weeks and, thus, have missed the recent discussions that led to the change of the Turtle shorthand notations (switching from | as the separator to ~). You give the example:

<Alice> :says << <Book> :mentions <Bob> ~ <speechact-1> >>
    ~ <recording-2> {| :mentions <Bob> |}
    ~ <recording-3> .

Is this the agreed-upon version of the shorthand notation now?

@kasei
Contributor

kasei commented Aug 16, 2024

Is this the agreed-upon version of the shorthand notation now?

And for those of us not joining the semantics TF calls, is there a summary of the new syntax? I've seen it mentioned in passing in emails and on the main WG calls, but don't recall seeing an introduction or summary of the new proposed syntax.

@niklasl

niklasl commented Aug 16, 2024

@hartig @kasei I think the discussions originated with w3c/rdf-star-wg#116 (referenced a couple of times on the list). It was worked through more in detail in w3c/rdf-turtle#51.

There's now text and grammar in https://w3c.github.io/rdf-turtle/spec/#reified-triples (thanks to @gkellogg, @TallTed and @afs!).

(It wasn't really part of the semantics discussions IIRC.)

@hartig
Contributor

hartig commented Aug 16, 2024

Thanks for the pointers Niklas! However:

There's now text and grammar in https://w3c.github.io/rdf-turtle/spec/#reified-triples (thanks to @gkellogg, @TallTed and @afs!).

The grammar in the Turtle doc is not updated yet, and the text talks only about the new version of the reifiedTriple production but not about the new version of the annotation syntax that your example uses. Yet, I see that there is an extensive discussion of this new version of the annotation syntax in the comments of PR w3c/rdf-turtle#51. So, I assume now that this PR here implements these changes for SPARQL and a similar PR for Turtle will follow. If that's the case, my earlier comment about the new ObjectPath production in this PR is obsolete.

@gkellogg
Member

The grammar in the Turtle doc is not updated yet, and the text talks only about the new version of the reifiedTriple production but not about the new version of the annotation syntax that your example uses. Yet, I see that there is an extensive discussion of this new version of the annotation syntax in the comments of PR w3c/rdf-turtle#51. So, I assume now that this PR here implements these changes for SPARQL and a similar PR for Turtle will follow. If that's the case, my earlier comment about the new ObjectPath production in this PR is obsolete.

w3c/rdf-turtle#62 does update the annotation syntax, although there is only minimal narrative discussing this. In the case of the annotation syntax, we have the potential to have multiple reifiers and annotation blocks; this doesn't make sense for the reifiedTriple production as that's expected to generate just a single term when parsing.

@afs
Contributor Author

afs commented Aug 18, 2024

?x ?y << ?s ?p ?o ~ ?id >> ~ ?id2 .

and (reifier declaration)
Note that the reifier declaration form does not allow a following reifier.

{ <<:s :p :o >> ~ :r . }

is not legal.

Temporarily, the sparql.org query validator is now running a development version with the latest SPARQL grammar and it will show the generated triples.

Testing has been "light" so far.

The RDF-star features are not fully propagated through the codebase. Only SPARQL parse-to-print is likely to work.

@kasei
Contributor

kasei commented Aug 18, 2024

@hartig @kasei I think the discussions originated with w3c/rdf-star-wg#116 (referenced a couple of times on the list). It was worked through more in detail in w3c/rdf-turtle#51.

Thanks, @niklasl. I had missed the subsequent discussion of details (maybe overlooked because I wouldn't have thought the syntax across formats would be a Turtle PR…?), and so had wrongly assumed it was something that had been agreed to in the semantics TF call. Thanks for the pointers!

@hartig
Contributor

hartig commented Aug 19, 2024

@afs

?x ?y << ?s ?p ?o ~ ?id >> ~ ?id2 .

[...]
is not legal.

In this case, my earlier comment about the ObjectPath production is actually relevant. That is, by the current definition of ObjectPath in this PR, the string above would be accepted by the TriplesBlock production.

@hartig
Contributor

hartig commented Aug 19, 2024

[@gkellogg] w3c/rdf-turtle#62 does update the annotation syntax, although there is only minimal narrative discussing this.

You are right. I was looking at the current main-branch version of the Turtle spec that Niklas was pointing to.

[@gkellogg] In the case of the annotation syntax, we have the potential to have multiple reifiers and annotation blocks; this doesn't make sense for the reifiedTriple production as that's expected to generate just a single term when parsing.

I agree regarding the reifiedTriple production, and I can also see now that the annotation syntax in w3c/rdf-turtle#62 accepts multiple reifiers and annotation blocks, exactly as the ObjectPath production for SPARQL in this PR here. Now I wonder, however, why @afs says above that

?x ?y  << ?s ?p ?o ~ ?id >> ~ ?id2 .

is illegal. The grammar in this PR accepts it and, similarly, the grammar in w3c/rdf-turtle#62 accepts the following Turtle string.

:x :y  << :s :p :o ~ :id >> ~ :id2 .

@afs
Contributor Author

afs commented Aug 19, 2024

Now I wonder, however, why @afs says above that

?x ?y  << ?s ?p ?o ~ ?id >> ~ ?id2 .

is illegal.

The declaration form is illegal. I didn't mean to say the ?x ?y << ?s ?p ?o ~ ?id >> ~ ?id2 . was illegal. My bad wording.

"""
Note that the declaration form, { <<:s :p :o >> ~ :r . }, is illegal.
"""
This is so that each use of the declaration syntax produces a single triple. That in turn means it is suitable for writing out Turtle while streaming.

@afs afs force-pushed the grammar-reifiers branch 2 times, most recently from f234ca0 to b3fcf9b Compare August 19, 2024 13:35
@hartig
Contributor

hartig commented Aug 19, 2024

The declaration form is illegal. I didn't mean to say the ?x ?y << ?s ?p ?o ~ ?id >> ~ ?id2 . was illegal. My bad wording.

Okay, then we are in agreement regarding ?x ?y << ?s ?p ?o ~ ?id >> ~ ?id2 .

As for the declaration form ...

"""
Note that the declaration form, { <<:s :p :o >> ~ :r . }, is illegal.
"""

I see that the new SPARQL grammar in this PR does not accept this. In contrast, however, the new Turtle grammar in w3c/rdf-turtle#62 does! (without the curly braces, of course). In particular, the triples production in line 8 of spec/turtle.bnf is currently defined as follows.

triples     ::= subject predicateObjectList
              | blankNodePropertyList predicateObjectList?
              | reifiedTriple predicateObjectList?

The relevant case here is the third option (reifiedTriple predicateObjectList?) -- notice the ? at the end.

@afs
Contributor Author

afs commented Aug 19, 2024

triples     ::= subject predicateObjectList
              | blankNodePropertyList predicateObjectList?
              | reifiedTriple predicateObjectList?

The relevant case here is the third option (reifiedTriple predicateObjectList?) -- notice the ? at the end.

I think that's OK (I don't have a turtle parser to check).

There must be a predicateObjectList to get to the annotation rule which is where ~:e and annotation {| |} are allowed after an object.

cc @gkellogg

@hartig
Contributor

hartig commented Aug 19, 2024

Yes there must be a predicateObjectList to get to the annotation rule, but my comment now was not about the annotation syntax.

Instead, it was about only writing a string that matches the reifiedTriple production, as a standalone thing (rather than in the subject or object position of something else). By the current grammar in w3c/rdf-turtle#62, the following is a valid Turtle file.

<< :s :p :o ~ :r >> .

In contrast, the SPARQL grammar in this PR does not accept this in a WHERE clause, such as the following.

SELECT * WHERE {
<< :s :p :o ~ :r >> .   # <-- illegal
}

@afs
Contributor Author

afs commented Aug 19, 2024

SELECT * WHERE {
<< :s :p :o ~ :r >> . # <-- illegal
}

That's legal. It does not have a reifier after the >>.

<< :s :p :o ~ :r1 >> ~:r2 .

A reifier after would apply to the triple from inside the <<...>>:

:r rdf:reifies <<(:s :p :o )>> .

and hence the syntax is two triples:

:r1 rdf:reifies <<( :s :p :o )>> .
:r2 rdf:reifies <<( :r1 rdf:reifies <<( :s :p :o )>>  )>> .

The annotation rule of Turtle includes both a following reifier for a triple and the {| .. |} -- both are optional.
This excludes a reifier after the declaration, not a reifier in the declaration.

Illegal: reifier after:

{ << :s :p :o ~ :r >> ~:e }
{ << :s :p :o >> ~:e }

Legal:

{ << :s :p :o ~ :r >> }

Same for Turtle if I read the BNF right and predicateObjectList is necessary to get to the annotation syntax.

This will be easier when we have parsers and not have to read raw BNF!

For SPARQL, I've updated sparql.org.
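Until parsers are available, the one rule discussed here (no reifier directly after a reified-triple declaration in subject position) can be approximated with a small check. This is a rough sketch, not a parser for the grammar in this PR: the function name `has_reifier_after_declaration` is hypothetical, the tokenization is naive, and it assumes prefixed names or variables only (no IRIs or literals containing `<<`, `>>` or `~`).

```python
def has_reifier_after_declaration(pattern: str) -> bool:
    """Return True if a reified triple in subject position is followed by `~`.

    Naive tokenization: `<<`, `>>` and `~` are split off as their own tokens,
    everything else is split on whitespace.
    """
    tokens = (
        pattern.replace("<<", " << ")
        .replace(">>", " >> ")
        .replace("~", " ~ ")
        .split()
    )
    if not tokens or tokens[0] != "<<":
        # The subject is not a reified triple, so this rule does not apply;
        # a reifier after an object is a different, permitted, case.
        return False
    depth = 0
    for i, tok in enumerate(tokens):
        if tok == "<<":
            depth += 1
        elif tok == ">>":
            depth -= 1
            if depth == 0:
                # Closing `>>` of the subject: check what comes right after it.
                return i + 1 < len(tokens) and tokens[i + 1] == "~"
    return False


for example in [
    "<< :s :p :o ~ :r >> ~:e",
    "<< :s :p :o >> ~:e",
    "<< :s :p :o ~ :r >>",
    "?x ?y << ?s ?p ?o ~ ?id >> ~ ?id2 .",
]:
    print(example, "->", has_reifier_after_declaration(example))
```

The check only flags the two "reifier after" forms listed as illegal above; it says nothing about whether the rest of a pattern is well-formed.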

@hartig
Contributor

hartig commented Aug 19, 2024

SELECT * WHERE {
<< :s :p :o ~ :r >> . # <-- illegal
}

That's legal.

Ah, yes, I see now. I overlooked that the PropertyListPath in ReifiedTripleBlockPath may also be the empty string. Apologies for the noise.

This will be easier when we have parsers and not have to read raw BNF!

:-D Indeed!

@niklasl

niklasl commented Aug 19, 2024

This will be easier when we have parsers and not have to read raw BNF!

FWIW, I've updated LDTR (mainly a TriG-to-JSON-LD transcriber) to support this reifier syntax. Its LDTR demo app can be used to play with Turtle-star examples.

(It's made using a PEG-based parser generator, so it should be on par with the EBNF. There are some rough corners though, and both the interface and visualization need more love. Also, over in the JSON-LD WG we're in the process of updating JSON-LD-star, so it's still quite experimental.)

@gkellogg
Member

I've updated my distiller with latest versions of N-Triples, N-Quads, and Turtle parsers.

Note that another difference between the Turtle grammar and SPARQL grammar is that SPARQL allows the annotation delimiters without any actual predicate or object:

SELECT * WHERE {
  :s :p :o {| |} .
}

The Turtle grammar requires there to be a predicateObjectList.

:s :p :o {| :p1 :o1 |}

You can achieve the same effect using a reifier instead of an annotation block:

:s :p :o ~

@afs
Contributor Author

afs commented Aug 19, 2024

The reason the SPARQL grammar has {| |} (empty body) is to help with machine generated syntax. It is a mild inconvenience to track "zero or non-zero".

I'm happy to go either way - they should align.

c.f. [ ] .
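To make the "zero or non-zero" point concrete, here is a minimal sketch of what a generator has to do when the grammar requires a non-empty annotation body. The helper name `annotation_block` and the list-of-pairs input are hypothetical, chosen only for illustration.

```python
def annotation_block(pairs):
    """Serialize annotation predicate/object pairs as `{| ... |}`.

    pairs is a list of (predicate, object) strings. With a grammar that
    forbids an empty body, the generator must track the empty case itself
    and emit nothing at all instead of `{| |}`.
    """
    if not pairs:
        return ""
    body = " ; ".join(f"{p} {o}" for p, o in pairs)
    return "{| " + body + " |}"


print(":s :p :o", annotation_block([(":p1", ":o1")]), ".")
```

If an empty `{| |}` were allowed, the `if not pairs` branch could be dropped and the block written unconditionally, which is the convenience being weighed against alignment with Turtle.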

@afs afs force-pushed the grammar-reifiers branch from b3fcf9b to ada1536 Compare August 20, 2024 10:27
@afs
Contributor Author

afs commented Aug 20, 2024

{| |} removed.

Member

@TallTed TallTed left a comment

Small, for clarity

afs and others added 2 commits August 28, 2024 19:04
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
@rubensworks rubensworks added the spec:substantive Change in the spec affecting its normative content (class 3) –see also spec:bug, spec:new-feature label Aug 29, 2024
@rubensworks rubensworks merged commit d0ebc5d into main Aug 29, 2024
2 checks passed
@rubensworks rubensworks deleted the grammar-reifiers branch August 29, 2024 16:53