Position of OPTIONAL influences result #4989
Current Behavior
Data:
Query with trailing optional (works as expected);
Query with leading optional omits the second solution;
Expected Behavior
Both results should be the same (ignoring ordering).
Steps To Reproduce
Self-contained reproduction class: https://gist.github.com/fkleedorfer/71f805ddcfc9f2dc5676fe0775ed11d3
Version
4.3.6, 4.3.11
Are you interested in contributing a solution yourself?
None
Anything else?
No response
Replies: 11 comments
This was tested using the MemoryStore, i.e.
I checked the SPARQL validator and these two queries produce quite different plans, even after optimisation. Essentially there is a scoping issue: the ?s from the type statement isn't in scope of the OPTIONAL if the OPTIONAL is written first.
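To make the difference in plans concrete, here is a minimal sketch of how a group graph pattern is folded into algebra, loosely following the translation rules in the SPARQL 1.1 specification (section 18.2.2.6). It is a simplification: FILTERs, BGP merging and the later `Join(Z, A) = A` simplification step are left out, and the function name `translate_group` and the tuple encoding are illustrative assumptions, not part of any library.

```python
# Simplified sketch of the SPARQL 1.1 group-pattern translation.
# Each element is ("bgp", text) or ("optional", text); this encoding is hypothetical.

def translate_group(elements):
    g = "Z"  # Z is the empty group pattern
    for kind, pattern in elements:
        if kind == "optional":
            # OPTIONAL is left-joined onto everything translated so far
            g = f"LeftJoin({g}, BGP({pattern}), true)"
        else:
            g = f"Join({g}, BGP({pattern}))"
    return g

TYPE = "?s rdf:type ex:Car"
COLOR = "?s ex:hasColor ?o"

trailing = translate_group([("bgp", TYPE), ("optional", COLOR)])
leading = translate_group([("optional", COLOR), ("bgp", TYPE)])

print(trailing)
print(leading)
```

With the OPTIONAL written first, the left side of the `LeftJoin` is just `Z`, so the type pattern only takes part in the outer `Join`.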
From the spec:
"In an optional match, either the optional graph pattern matches a graph, thereby defining and adding bindings to one or more solutions, or it leaves a solution unchanged without adding any additional bindings."
In your case the query with the optional clause first will define new solutions, but since it's before the type pattern it won't see the results from that pattern.
The reason it works when the optional is after the type pattern is because of: "or it leaves a solution unchanged without adding any additional bindings." With the optional clause first though, there won't be any solutions to leave unchanged since there are no previous solutions.
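The effect described above can be reproduced with a small simulation of the `Join` and `LeftJoin` operators over solution mappings. This is only a sketch: the data (`ex:car1` with a color, `ex:car2` without one) is hypothetical, since the original data is not shown in the thread, and the helper functions are illustrative, not RDF4J code.

```python
def compatible(a, b):
    # Two solutions are compatible if they agree on every shared variable
    return all(a[k] == b[k] for k in a.keys() & b.keys())

def join(left, right):
    return [{**l, **r} for l in left for r in right if compatible(l, r)]

def left_join(left, right):
    out = []
    for l in left:
        matches = [{**l, **r} for r in right if compatible(l, r)]
        # Keep the left solution unchanged when nothing on the right matches
        out.extend(matches if matches else [l])
    return out

# Hypothetical data: two cars, only one of which has a color
type_solutions = [{"s": "ex:car1"}, {"s": "ex:car2"}]
color_solutions = [{"s": "ex:car1", "o": "ex:red"}]
empty_group = [{}]  # the empty group pattern: one solution with no bindings

# ?s rdf:type ex:Car . OPTIONAL { ?s ex:hasColor ?o . }
trailing = left_join(join(empty_group, type_solutions), color_solutions)

# OPTIONAL { ?s ex:hasColor ?o . } ?s rdf:type ex:Car .
leading = join(left_join(empty_group, color_solutions), type_solutions)

print(trailing)
print(leading)
```

In the leading case the OPTIONAL is evaluated against the single empty solution, so it yields only the colored car, and the later join with the type pattern cannot bring `ex:car2` back.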
Are you saying this works as intended? GraphDB, for one, does not agree, I believe. I will run more checks.
I just tested the example with Jena 3.16.0 and GraphDB 10.6.2.
GraphDB produces the results I was expecting in both cases, i.e. both times
From reading the spec and looking at the W3C SPARQL Validator, it looks like RDF4J and Jena are both working according to spec. GraphDB appears to be incorrect.
It may be according to spec, but it is very counter-intuitive. Maybe that is why GraphDB changed it: it only invites errors without any advantage to the user whatsoever.
Try to read the query sequentially, where OPTIONAL means "for the above query patterns, add the following query patterns if they exist". It might be a bug in GraphDB.
GraphDB deviates from the specification on this issue because the benefit of optimizing the position of the OPTIONAL outweighs the fact that, placed first, it would ignore results that do not initially match the pattern. Also, as you mention, it is the more intuitive resolution.
Preferably, the semantics of queries are indeed in line with the intuition of most users. I don't know whether that's the case here though. For me these two queries are pretty clear and clearly different :)
"Find statements with type
?s rdf:type ex:Car .
OPTIONAL { ?s ex:hasColor ?o . }
"Find statements for { }
OPTIONAL { ?s ex:hasColor ?o . }
?s rdf:type ex:Car .
(the
Whether deviating from the semantics actually allows for an optimisation eludes me. You need some (minimal) extra branching perhaps for the second query, but other than that I'd say you can do the same type of (block) nested loops, hash joins or whatever is best given the cardinalities.