Queries involving blank nodes return Bad Request error. #22
This may be a Comunica bug indeed.
Part of resolving LDflex/LDflex-Comunica#22, as blank nodes of the form `nodeID://1234` from Virtuoso need to be treated as absolute IRIs when going through this parser.
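To illustrate what "treated as an absolute IRI" means here, the following is a minimal sketch of a scheme check based on the RFC 3986 scheme grammar. The name `isAbsoluteIri` is hypothetical and this is not the SPARQL.js implementation; it only shows that a case-insensitive scheme match accepts the mixed-case `nodeID` scheme.

```typescript
// Hypothetical helper, not part of SPARQL.js or Comunica.
// RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":".
const SCHEME = /^[a-z][a-z0-9+.-]*:/i;

// Returns true when the string starts with a syntactically valid scheme,
// in which case it should not be resolved against a base IRI.
function isAbsoluteIri(iri: string): boolean {
  return SCHEME.test(iri);
}

console.log(isAbsoluteIri("nodeID://1234")); // true
console.log(isAbsoluteIri("1234"));          // false
```

Dropping the `i` flag from the pattern would make `nodeID://1234` look relative, which is one way a parser could end up resolving it against the base IRI.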
Part of the issue is that the engine configuration converts blank nodes to variables (https://github.com/comunica/comunica/blob/bf5e6a2a0807188bd0bab224da1746b9f48c5d33/packages/actor-rdf-resolve-quad-pattern-sparql-json/lib/ActorRdfResolveQuadPatternSparqlJson.ts#L34-L68). In this particular use case, and IMO in most use cases for LDflex, this is unexpected behavior (in fact, I don't quite follow why this conversion is done in the default configuration of the Comunica engine either), so this needs to be changed somehow. The second issue is that Virtuoso expects blank nodes to be of the form
I created a PR in SPARQL.js so that this doesn't get resolved as a relative IRI, but in addition Comunica needs to convert the blank node from the skolemized format to this format. I think the best way of doing this would be to create some actors for specific SPARQL endpoints (Virtuoso, Fuseki etc.) that can handle this, as well as other aspects of the endpoints that do not follow the SPARQL spec correctly. If you think this is the correct approach, I will make a fork to create these actors. Regarding https://github.com/comunica/comunica/blob/45b015588dce4723ac9259314723ca4e598f6417/packages/actor-rdf-resolve-quad-pattern-federated/lib/FederatedQuadSource.ts#L80-L132 - one could strip off the
This is a standard operation in SPARQL query engines.
AFAICS, the only problem then seems to be the SPARQL parsing issues, which you seem to have fixed already in RubenVerborgh/SPARQL.js#124.
Thanks for the clarifications! The patch allowed me to successfully run the following query with the skolem IRI (using the default setup for the comunica engine with nodejs)
the original query
still gets converted to
I'll make a PR later that strips off the
The problem then becomes - running the following code
async function showShape(shape: any) {
  console.log(`This person is ${await shape.label}`);
  for await (const property of shape.property) {
    console.log("path associated to property", `${await property.label}`)
  }
}
would produce fairly unintuitive results as calling
In the short term I can just use Named Nodes for each property in my data. Long term it would be good to have this handled in some way. In cases where one is working with an endpoint that uses skolem IRIs, I guess a specialised Comunica actor could test and ensure that the
query is issued. If this is not the case I guess something like
would have to be issued by LDflex (similarly to what is sent if you queried
Has there been any kind of discussion surrounding how LDflex should handle these kinds of issues with the semantics of blank nodes?
If this query is obtained, then Comunica must be receiving
The history of this feature can be read here: LDflex/Query-Solid#34
I will investigate what is going on and read up on LDflex/Query-Solid#34 - thanks!
Indeed - these are the raw bindings that Virtuoso will send to Comunica. That is, Virtuoso identifies them as Blank Nodes. Hence this piece of code https://github.com/comunica/comunica/blob/bf5e6a2a0807188bd0bab224da1746b9f48c5d33/packages/actor-rdf-resolve-quad-pattern-federated/lib/FederatedQuadSource.ts#L87-L93 ingests the terms as such
Based on this, it seems to me the options to resolve are
becomes the valid query
in addition - create an actor that uses the context being passed around to identify that the SPARQL endpoint is from an instance of Virtuoso and sends use the priority mediator to obtain a custom actor (
to an endpoint hosted by Virtuoso it will become
Let me just summarize this issue for the sake of future reference:
Summary
LDflex allows data lookups via separate chained SPARQL query executions. Comunica already takes care of this for document-based sources, but this is not possible to apply for SPARQL endpoints in a generic manner. For Virtuoso specifically, it may be possible to support this, as Virtuoso allows you to pass blank node labels as IRIs in a deterministic manner (e.g.
And here's my reply to the comment above:
If I understand correctly, you would basically intercept the response from here, and rewrite each bnode result so that it becomes a skolem IRI. We could for example hook into
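The interception idea described above can be sketched as follows. This is only an illustration under stated assumptions, not Comunica code: the helper names `skolemizeTerm` and `skolemizeBindings` are hypothetical, the term shape is a simplified version of the SPARQL 1.1 Query Results JSON Format, and it is assumed that the endpoint accepts `nodeID://<label>` IRIs in place of blank nodes.

```typescript
// Simplified term shape from the SPARQL 1.1 Query Results JSON Format.
interface JsonTerm {
  type: "uri" | "literal" | "bnode";
  value: string;
  datatype?: string;
  "xml:lang"?: string;
}
type JsonBinding = { [variable: string]: JsonTerm };

// Assumption: the endpoint understands IRIs with this prefix as blank node labels.
const SKOLEM_PREFIX = "nodeID://";

// Hypothetical helper: turn a bnode term into a skolem IRI term.
// Non-bnode terms are returned unchanged.
function skolemizeTerm(term: JsonTerm): JsonTerm {
  if (term.type !== "bnode") return term;
  // Avoid double-prefixing if the label already carries the prefix.
  const hasPrefix = term.value.indexOf(SKOLEM_PREFIX) === 0;
  return { type: "uri", value: hasPrefix ? term.value : SKOLEM_PREFIX + term.value };
}

// Hypothetical helper: rewrite every binding of a result set.
function skolemizeBindings(bindings: JsonBinding[]): JsonBinding[] {
  return bindings.map((binding) => {
    const rewritten: JsonBinding = {};
    for (const name of Object.keys(binding)) {
      rewritten[name] = skolemizeTerm(binding[name]);
    }
    return rewritten;
  });
}
```

With this rewrite in place, a follow-up query would refer to the node by IRI rather than by blank node label, so nothing in that query needs to be turned into a variable.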
Not exactly sure whether this best belongs here, in the comunica repo, or possibly even the LDflex repo, depending on how you want to go about solving the issue.
When using LDflex to query over a SHACL constraint in a repo we are hosting - I want to use the following pattern.
Because each property in the shape is a blank node, `property` is a blank node, so calling `await property.label` causes the query
to be issued by LDflex.
With the current engine setup this query eventually gets passed through the `replaceBlankNodes` function within `ActorRdfResolveQuadPatternSparqlJson`, and hence the query that actually gets sent by the engine is
This has invalid variable names, so the SPARQL endpoint returns an invalid variable name error.
But more to the point: the blank nodes should not be changed to variables here. I'm guessing that either the query engine config needs to be fixed here, or there is some bug in the Comunica engine.
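For reference, converting blank nodes in a query pattern to variables can be done without producing invalid variable names. The sketch below is not Comunica's `replaceBlankNodes`; the term model and the function name `blankNodesToVariables` are hypothetical. It maps each distinct blank node label to a freshly generated name instead of reusing the label, so a label such as `nodeID://1234` never ends up inside a variable name.

```typescript
// Minimal term model for illustration; real engines use RDF/JS terms instead.
type Term =
  | { termType: "NamedNode"; value: string }
  | { termType: "BlankNode"; value: string }
  | { termType: "Variable"; value: string }
  | { termType: "Literal"; value: string };

interface Pattern {
  subject: Term;
  predicate: Term;
  object: Term;
}

// Hypothetical helper: replace blank nodes by variables with generated names.
function blankNodesToVariables(patterns: Pattern[]): Pattern[] {
  // Null-prototype object so that labels like "toString" are handled safely.
  const names: { [label: string]: string } = Object.create(null);
  let counter = 0;

  const convert = (term: Term): Term => {
    if (term.termType !== "BlankNode") return term;
    // The same label must map to the same variable across all patterns.
    if (!(term.value in names)) names[term.value] = "bnode" + counter++;
    return { termType: "Variable", value: names[term.value] };
  };

  return patterns.map((p) => ({
    subject: convert(p.subject),
    predicate: convert(p.predicate),
    object: convert(p.object),
  }));
}
```

A complete version would also have to make sure the generated names do not clash with variables already present in the query; that check is left out here.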