Use RDF Dataset definition from RDF Concepts #152

Merged: 5 commits, Oct 31, 2024

Conversation

rubensworks
Member

@rubensworks rubensworks commented Aug 28, 2024

Closes #28



@rubensworks rubensworks added the spec:editorial Minor change in the specification (markup, typo, informative text; class 1 or 2) label Aug 28, 2024
@rubensworks rubensworks requested review from kasei, afs, Tpt and hartig August 28, 2024 08:47
spec/index.html (review thread, resolved)
@afs
Contributor

afs commented Aug 28, 2024

We should align to the RDF Concepts definition - that is to be expected; it was spec timing that caused the divergence.

It is not editorial though because of the blank node change.

The consequences are limited: allowing blank nodes in the syntax mostly has no effect, because a blank node written in a query is not the same node as any blank node in the graph. GRAPH [] / GRAPH _:b could be allowed, with that blank node acting as a variable, as it does in BGPs.

An info box in the RDF dataset could mention this.

To work through the specs:

  1. Should FROM NAMED allow a blank node?
  2. Should GRAPH blanknode be allowed?
  3. Update: should INSERT DATA syntax allow a blank node? It would be a fresh blank node.
  4. Update: should LOAD allow dataset forms?

The main use case for graphs named by blank nodes comes from loading RDF syntax forms that allow them.

Currently, I only see point 4 as necessary.

See w3c/sparql-update#41

@rubensworks rubensworks added spec:substantive Change in the spec affecting its normative content (class 3) –see also spec:bug, spec:new-feature and removed spec:editorial Minor change in the specification (markup, typo, informative text; class 1 or 2) labels Aug 28, 2024
@hartig
Contributor

hartig commented Aug 28, 2024

Where is the notion of an "RDF Dataset Merge" actually used in the spec? I searched the HTML source code for "defn_RDFDatasetMerge" but couldn't find it anywhere else.

@rubensworks
Member Author

Where is the notion of an "RDF Dataset Merge" actually used in the spec? I searched the HTML source code for "defn_RDFDatasetMerge" but couldn't find it anywhere else.

You seem to be right. I also can't find any direct usages.
There are multiple mentions of "RDF merge of the graphs" using the "RDF Merge" definition from RDF semantics: https://w3c.github.io/sparql-query/spec/#unnamedGraph https://w3c.github.io/sparql-query/spec/#exampleDatasets
But since "RDF Dataset Merge" is not directly used (anymore?), we may want to consider removing it.

@afs
Contributor

afs commented Aug 28, 2024

Where is the notion of an "RDF Dataset Merge" actually used in the spec? I searched the HTML source code for "defn_RDFDatasetMerge" but couldn't find it anywhere else.

In update, maybe. It, or the graph store equivalent, will be if LOAD takes quads.

Even if it's not, I think we need a definition somewhere because of the blank nodes shared between graphs. It could go in RDF Concepts if rewritten to be "concepts" style.

@hartig
Contributor

hartig commented Aug 28, 2024

  1. Should FROM NAMED allow a blank node?

I would say no because I am not sure what this would even mean.

One potential interpretation might be to understand such a blank node as something like a variable, as is the case for blank nodes in BGPs. However, this interpretation doesn't make much sense as SPARQL also doesn't allow something like FROM NAMED ?v.

Another potential interpretation might be to understand such a blank node as a way to refer to a named graph that is named with that blank node in the underlying RDF dataset of the triple store. Yet, this wouldn't work because the blank node cannot be given directly but must be written via a blank node identifier, e.g.,

FROM NAMED _:b

which the SPARQL parser would then turn into a fresh blank node rather than some blank node that is used as a graph name in some dataset.
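
As an illustration of the point about fresh blank nodes, here is a minimal Python sketch. It is not taken from the spec or from any SPARQL implementation: the `BlankNode` class and `parse_labels` function are hypothetical, and the sketch assumes only that each parsed document (a data file or a query) has its own blank node label scope.

```python
import itertools


class BlankNode:
    """Toy blank node: identity is the object itself, never the label."""

    _counter = itertools.count()

    def __init__(self):
        self.id = next(BlankNode._counter)

    def __repr__(self):
        return f"_:genid{self.id}"


def parse_labels(labels):
    """Map each blank node label seen in ONE parsed document to a fresh node.

    Repeated labels within the same document map to the same node.
    """
    scope = {}
    for label in labels:
        if label not in scope:
            scope[label] = BlankNode()
    return scope


# Hypothetical dataset: a named graph was labelled _:b in the loaded file.
dataset_scope = parse_labels(["_:b"])
graph_name_in_dataset = dataset_scope["_:b"]

# A later query containing "FROM NAMED _:b" is parsed in its own scope.
query_scope = parse_labels(["_:b"])
graph_name_in_query = query_scope["_:b"]

# Same label, different blank nodes: the query cannot address the graph.
same_node = graph_name_in_query is graph_name_in_dataset
print(same_node)
```

Under these assumptions the label `_:b` in the query never denotes the blank node that names the graph in the dataset, which is why this interpretation of FROM NAMED cannot work.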

  2. Should GRAPH blanknode be allowed?

Probably not. It has almost the same problem as FROM NAMED. That is, trying to use this blank node as a graph name does not work because the blank node needs to be written via a blank node identifier. The alternative interpretation (using such a blank node as a form of "anonymous", non-projectable variable) has very little value as the blank node identifier via which this blank node is written should be constrained to be different from the blank node identifiers used in any BGP of the query (exactly as is the case for blank node identifiers in different BGPs).

  3. Update: should INSERT DATA syntax allow a blank node? It would be a fresh blank node.

I don't immediately see a problem with that.

Notice, however, that DELETE DATA is a different story.

  4. Update: should LOAD allow dataset forms?

I think that's an orthogonal question.

A related question that is not orthogonal is whether the INTO GRAPH feature of LOAD should allow a blank node. If it was allowed, the result should always be a new named graph with a new blank node as name (because the blank node identifier via which such a blank node at INTO GRAPH is written would be parsed into a fresh blank node).

@hartig
Contributor

hartig commented Aug 28, 2024

Where is the notion of an "RDF Dataset Merge" actually used in the spec? I searched the HTML source code for "defn_RDFDatasetMerge" but couldn't find it anywhere else.

In update, maybe. It, or the graph store equivalent, will be if LOAD takes quads.

Update has its own notion; see 5.2.1 Dataset-UNION. Notice that this one is slightly different from "RDF Dataset Merge" in Query. In particular, Dataset-UNION is defined in terms of set union of graphs, whereas "RDF Dataset Merge" is defined in terms of the graph merge operation of RDF.
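
The difference between the two operations can be sketched in Python as follows. This is a toy model, not the normative definition: graphs are sets of string triples, terms starting with `_:` stand for blank nodes, and equal labels are assumed to denote the same blank node (as when two graphs in one store share a blank node). The helper names `blank_nodes`, `graph_union`, and `graph_merge` are made up for this illustration.

```python
def blank_nodes(graph):
    """Return the set of blank node labels used in a graph."""
    return {term for triple in graph for term in triple if term.startswith("_:")}


def graph_union(g1, g2):
    """Plain set union: blank nodes shared by g1 and g2 stay shared."""
    return g1 | g2


def graph_merge(g1, g2):
    """Merge: rename blank nodes of g2 that clash with g1, then take the union."""
    taken = blank_nodes(g1) | blank_nodes(g2)
    renaming = {}
    counter = 0
    for label in sorted(blank_nodes(g1) & blank_nodes(g2)):
        # Pick a fresh label that is unused in either graph.
        while f"_:m{counter}" in taken:
            counter += 1
        fresh = f"_:m{counter}"
        renaming[label] = fresh
        taken.add(fresh)
    renamed_g2 = {tuple(renaming.get(t, t) for t in triple) for triple in g2}
    return g1 | renamed_g2


g1 = {("_:x", "ex:p", "ex:a")}
g2 = {("_:x", "ex:q", "ex:b")}

union_result = graph_union(g1, g2)
merge_result = graph_merge(g1, g2)

print(len(blank_nodes(union_result)))  # shared blank node is kept
print(len(blank_nodes(merge_result)))  # blank nodes are kept apart
```

In this model the union keeps a single blank node that both triples talk about, while the merge yields two distinct blank nodes, so the two results are not interchangeable when graphs share blank nodes.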

Even if it's not, I think we need a definition somewhere because of the blank nodes shared between graphs. It could go in RDF Concepts if rewritten to be "concepts" style.

Yes, RDF Concepts seems to be a more suitable place for such a definition.

@afs
Contributor

afs commented Aug 28, 2024

  1. Should FROM NAMED allow a blank node?

I would say no because I am not sure what this would even mean.

OK - I was getting ahead of myself and imagining an AS.
Nothing for FROM NAMED.

@afs
Contributor

afs commented Aug 28, 2024

  4. Update: should LOAD allow dataset forms?

I think that's an orthogonal question.

A related question that is not orthogonal is whether the INTO GRAPH feature of LOAD should allow a blank node. If it was allowed, the result should always be a new named graph with a new blank node as name (because the blank node identifier via which such a blank node at INTO GRAPH is written would be parsed into a fresh blank node).

Good point. It is useful for loading data, preparing it, then updating the dataset. There are also the other dataset operations (CLEAR, CREATE, DROP, COPY, MOVE, ADD), because one request can consist of several operations that share a blank node scope.

We don't want to pile on too much work. We could agree on a design in the sparql-update issues, where it can be done if time allows; otherwise, leave it to the ongoing WG after the 2024-2026 charter ends.

Recorded for now: w3c/sparql-update#42

This PR can be for "RDF Dataset" in SPARQL Query only.

spec/index.html (review thread, outdated, resolved)
@rubensworks
Member Author

All comments have been resolved (except for the RDF Dataset Merge discussion which has been deferred to #155).

Note also the change in the (now informal) RDF dataset definition which also allows for graphs to be identified by blank nodes, instead of just IRIs.

@rubensworks rubensworks requested review from afs and gkellogg September 18, 2024 10:58
spec/index.html (review thread, outdated, resolved)
Co-authored-by: Ted Thibodeau Jr <tthibodeau@openlinksw.com>
@hartig
Contributor

hartig commented Sep 23, 2024

Thanks @rubensworks for resolving the comments.

Do we still want to remove the definition of the notion of "RDF Dataset Merge" as considered above?

@rubensworks
Member Author

Do we still want to remove the definition of the notion of "RDF Dataset Merge" #152 (comment)?

I've created an issue for this (#155). Since there are some differing opinions, it may be better to discuss it there (maybe even during one of our weekly meetings).

@hartig
Contributor

hartig commented Sep 24, 2024

I've created an issue for this (#155).

Ah, I must have missed that one. Perfect. Then I am fine as well with merging this PR.

@afs
Contributor

afs commented Oct 31, 2024

It was discussed and agreed in the working group meeting of 2024-10-31 to move the definition of dataset merge to RDF Semantics (section 10):

  • Put a definition of dataset merge in RDF Semantics (section 10)
  • Remove from SPARQL query
  • No links in SPARQL docs to fix up

@afs afs merged commit 78300a6 into main Oct 31, 2024
2 checks passed
@afs afs deleted the change/rdf-dataset-defn branch October 31, 2024 17:15
Labels
spec:substantive Change in the spec affecting its normative content (class 3) –see also spec:bug, spec:new-feature
Development

Successfully merging this pull request may close these issues.

RDF 1.1 Alignment - RDF Term and RDF Dataset
7 participants