standardize behavior of nested brackets and UNION clauses #103
Comments
First thing, GraphDB and RDF4J are doing the wrong thing, because SPARQL evaluates nested queries from inside-out (a/k/a bottom-up). Then -- what version of Virtuoso (output of I ask because I'm not seeing your results from Enterprise/Commercial Edition v08.03.3315 (9c4e63d226) built 2019-10-25 (behind URIBurner.com) --
-- nor Enterprise/Commercial Edition v07.20.3232 (2afcc45d7c) built 2019-08-09 (behind DBpedia.org) --
-- nor Open Source Edition (VOS) v07.20.3229 (17c4ba1) built 2018-08-17 (behind UniProt.org) --
With the obvious exception, these results are what should be delivered. |
@VladimirAlexiev, @TallTed is correct that the GraphDB/RDF4J behavior is incorrect here. In fact we have recently done several fixes in RDF4J's SPARQL engine to correct for these kinds of scoping corner cases (see for example eclipse-rdf4j/rdf4j#1405). If you'll try your second query in the latest version of RDF4J (3.0.2) you'll see that it in fact returns no binding for The reason these kinds of scoping gotchas are hard for the RDF4J engine, by the way, is that its engine was originally designed as an iterative "top-down" evaluation mechanism. For most SPARQL queries, this is not a real issue as evaluation order has no real influence on the eventual result, and we have made corrections to cater for the more obvious cases where evaluation order does make a difference in scoping (there are several such cases in the DAWG query test suite, all of which RDF4J correctly evaluates). Clearly though, we haven't quite caught them all yet. |
@TallTed I tested on the DBpedia endpoint.
But these are not queries, they are just brackets (q2) or UNION clauses (q1). About the inside-out strategy: SPARQL also says that implementations are free to optimize, implying the situation where triple pattern results will be joined between the two levels. The problem in this case is that BIND doesn't perform a join but an assignment... So: outer bindings are not visible in a set of brackets. This semantics may be correct, but is very non-intuitive and IMHO totally useless. @afs @ericprud, could you please comment? |
@VladimirAlexiev what are you suggesting here? Changing the specification more to your liking? Intuitiveness is subjective. You should really be looking at an algebra representation (e.g. on http://sparql.org):
|
@VladimirAlexiev -- There are already channels for bug reporting. When there are differences in implementations, the first step is to discuss with the implementers or use public-rdf-dawg-comments@w3.org. w3c/rdf-tests is the place to propose tests, but bind/bind07.rq looks like it covers the matter. "Inside-out" is in fact what you are used to: (1+3)*4 = 16. Thank you to @jeenbroekstra for explaining RDF4J. Jena works by starting with a join tree and then tries to find better ways to evaluate it. One important optimization is finding whether a query can be executed iteratively (less memory, usually faster -- a form of index join with globally scoped variables). This area has been the scene of several bugs. Optimizations aren't always easy. SPARQL defines the correct answers. Implementations can do what they like, but the correct results are well defined. |
Optional elements of the syntax certainly make it appear that these are "just" clauses, but they are in fact subqueries. Braces I cannot immediately explain how you got your reported results on the DBpedia endpoint. You can see my results from that endpoint above. Can you provide live links that get your reported results? |
@TallTed https://www.w3.org/TR/sparql11-query/#subqueries are different from https://www.w3.org/TR/sparql11-query/#GroupPatterns. (Note: you can navigate the grammar easily here: http://rawgit2.com/VladimirAlexiev/grammar-diagrams/master/sparql11-grammar.xhtml#GroupGraphPattern) The translation of the two is also different https://www.w3.org/TR/sparql11-query/#sparqlTranslateGraphPatterns. I get
I don't think it does. It's clear that the bindings between two UNION clauses are independent. My objection is why bindings from the outer scope are not used in the inner. But anyway, I see I'm in the wrong, though I still think this is a horrible mis-feature of SPARQL. Closing the issue. |
The query is:
and ?z does not get bound in either arm of the union because ?o is not available inside the union. Whether the left side of the join is a BIND or pattern matching makes no difference. |
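The difference between the two behaviours discussed in this thread can be sketched with a toy model of the SPARQL algebra. This is only an illustration under stated assumptions: the query modelled is a hypothetical `BIND(1 AS ?o) { BIND(?o+1 AS ?z) } UNION { BIND(?o+2 AS ?z) }`, not necessarily the one posted in the issue, and the helper names (`extend`, `join`, `union`, `arm`, `bottom_up`, `top_down`) are invented for this sketch and do not correspond to any engine's API.

```python
from typing import Callable, Dict, List

Solution = Dict[str, int]          # one result row: variable name -> value
Expr = Callable[[Solution], int]   # raises KeyError when a variable is unbound


def extend(rows: List[Solution], var: str, expr: Expr) -> List[Solution]:
    """Model of Extend (BIND): an expression error leaves var unbound."""
    out = []
    for row in rows:
        new_row = dict(row)
        try:
            new_row[var] = expr(row)
        except KeyError:
            pass  # unbound variable in the expression: no binding is added
        out.append(new_row)
    return out


def join(left: List[Solution], right: List[Solution]) -> List[Solution]:
    """Model of Join: merge every pair of rows that agree on shared variables."""
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[v] == r[v] for v in l.keys() & r.keys())
    ]


def union(left: List[Solution], right: List[Solution]) -> List[Solution]:
    """Model of Union: concatenate the rows of both arms."""
    return left + right


EMPTY: List[Solution] = [{}]  # the single empty solution a group starts from


def arm(seed: List[Solution], delta: int) -> List[Solution]:
    """One UNION arm of the hypothetical query: BIND(?o + delta AS ?z)."""
    return extend(seed, "z", lambda row: row["o"] + delta)


def bottom_up() -> List[Solution]:
    """Each arm starts from the empty solution, so ?o is not visible inside."""
    outer = extend(EMPTY, "o", lambda row: 1)
    return join(outer, union(arm(EMPTY, 1), arm(EMPTY, 2)))


def top_down() -> List[Solution]:
    """Naive iterative evaluation: outer rows are pushed into each arm."""
    outer = extend(EMPTY, "o", lambda row: 1)
    return union(arm(outer, 1), arm(outer, 2))
```

In this model `bottom_up()` yields two rows that bind only `o`, which corresponds to the behaviour described above as correct, while `top_down()` binds `z` in both rows because the outer binding leaks into the arms.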
this issue was recently referenced in a discussion about the expected behaviour of rdf4j. consider this variant of the query which introduced this issue:
the evaluation order does not affect the result - the scoping rules do. |
The scoping rules are a syntax issue and may be useful to explain evaluation as well. In the above example, the query isn't legal syntax - the example tries to https://www.w3.org/TR/sparql11-query/#variableScope (Note: the example has been edited since. It was:
|
independent of the lexicographic error in the initial version, above, is the following query also illegal "syntax"?
|
That's OK. It is two attempts to bind a variable in the same row that is not allowed (not via a join). The union is two different rows. It can be joined with a |
please rephrase this:
as it reads, neither does it refer to a "syntax issue" nor is the meaning of "not via a join" clear.
|
That is legal: there are two parts, each adding a binding to an empty solution. The |
i agree that it is legal. i tried it here.
where i understand that implementation to be correct in that the first variant yields a result under sparql binding semantics, but the second does not. |
Why?
I'm getting data from a root object, and a bunch of branches stemming from it that may include multiple values at any level.
I want to use UNION to avoid a Cartesian product (returning all combinations of values from the different branches). This applies not only to SELECT but also to CONSTRUCT and INSERT, which afaik work over the result set of a corresponding SELECT and would be slowed down by the Cartesian explosion.
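To make the row-count argument concrete, here is a minimal sketch that compares the two result shapes. The branch names and values are made up for illustration, and `joined_rows` and `union_rows` are hypothetical helpers, not part of any SPARQL library.

```python
from itertools import product
from typing import Dict, List


def joined_rows(root: str, branches: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Joining all branches: one row per combination of one value per branch."""
    names = list(branches)
    return [
        {"root": root, **dict(zip(names, combo))}
        for combo in product(*(branches[name] for name in names))
    ]


def union_rows(root: str, branches: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """One UNION arm per branch: one row per (branch, value) pair."""
    return [
        {"root": root, name: value}
        for name, values in branches.items()
        for value in values
    ]


# Hypothetical data: three branches with 3, 2 and 4 values respectively.
example_branches = {
    "label": ["a", "b", "c"],
    "type": ["T1", "T2"],
    "author": ["w", "x", "y", "z"],
}
```

With this data the joined form has 3 * 2 * 4 = 24 rows, while the union form has 3 + 2 + 4 = 9 rows, each binding the root and a single branch variable.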
Query
Consider this simple query that accesses no data
And even its simpler variant that doesn't use UNION
Differences
37000 Error SP031: SPARQL compiler: The list of return values contains '*' but the pattern does not contain variables
cc @kidehen
RDF::Query returns these errors @kasei
IMHO GraphDB and rdf4j are doing the right thing