Returning modified data graph instead of validation report? #189

Closed
devonsparks opened this issue Jul 7, 2023 · 5 comments

@devonsparks

Is there a way to get back the inferred triples (from pre-inferencing and SHACL rules) instead of the validation report?

I thought I might be able to read the target_graph of Validator, like this (here only demoing RDFS pre-inferencing):

from pyshacl import Validator
from rdflib import Graph

g = Graph()

smts = """
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix C: <http://classes.org/> .
@prefix ex: <http://example.org/> .

C:A rdfs:subClassOf C:B . 
ex:something a C:A .
"""

g.parse(data=smts)
v = Validator(g,
      inference='rdfs',
      advanced=True)

v.run()
print(v.target_graph.serialize(format='ttl'))

But the result is unchanged:

@prefix C: <http://classes.org/> .
@prefix ex: <http://example.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:something a C:A .

C:A rdfs:subClassOf C:B .

# Expecting to see 
#       <ex:something> a <C:B> 
# inferred 

Similar results for SHACL Rules.

Thanks in advance!

@ajnelson-nist
Contributor

I thought, in part from the discussions around inoculation, that the data graph would have been modified. But I tried your code and saw v.data_graph serialized the same before and after v.run().

I'm interested in an answer to this thread too, because I'm interested in seeing how to extract the triples generated from SHACL (Advanced Features) Rules.

@devonsparks
Author

Reading through the Validator code, I also tried passing inplace=True, and inspecting the data_graph after the run. No change. I can confirm the trouble isn't with either the RDFS or SHACL Rules I've set, because they work fine in Protege and TopBraid's SHACL engine respectively. Also tried separating rules and ontology graphs from the data graph to no effect. Will keep digging through the code to try to grok where the output is being kept/lost, but appreciate insight from anyone more familiar with the code base.

@ashleysommer
Collaborator

ashleysommer commented Jul 8, 2023

Hi @devonsparks

This question is another duplicate of #20, which has been asked and answered many times (#20, #78, #148), with additional discussion in #60. In short, PySHACL is just a SHACL validation engine. Its purpose is to validate a data graph against given SHACL shapes and constraints in accordance with the W3C SHACL Specification, and to return a validation result and a validation report. It does perform OWL/RDF inferencing (entailment) and expansion using SHACL Rules internally for the purposes of validating the graph, but it does not make the expanded graph available to the user. The SHACL Spec itself says that a validator should not modify the data graph as part of validation. To accommodate that, PySHACL takes an internal copy of the input data graph and performs any required modifications on that copy only. That is why the original data graph is unmodified.
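To illustrate the copy-versus-inplace behaviour described above, here is a minimal sketch in plain Python. It is not PySHACL's code: triples are simple tuples, and `expand_subclasses` and `run` are hypothetical helpers that only mimic one RDFS rule (instances of a subclass are also instances of the superclass).

```python
# Hypothetical illustration only; this is NOT how PySHACL is implemented.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def expand_subclasses(triples):
    """Add (x rdf:type B) for every (x rdf:type A) and (A rdfs:subClassOf B).

    Mutates and returns the given set, repeating until nothing new is added.
    """
    changed = True
    while changed:
        changed = False
        subclass_pairs = {(s, o) for s, p, o in triples if p == SUBCLASS_OF}
        for s, p, o in list(triples):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                if o == sub and (s, RDF_TYPE, sup) not in triples:
                    triples.add((s, RDF_TYPE, sup))
                    changed = True
    return triples


def run(data, inplace=False):
    """Expand a working copy by default; touch the caller's set only if inplace=True."""
    target = data if inplace else set(data)
    expand_subclasses(target)
    return target


data = {("C:A", SUBCLASS_OF, "C:B"), ("ex:something", RDF_TYPE, "C:A")}

expanded = run(data)        # default: the caller's data is left alone
print(len(data), len(expanded))

run(data, inplace=True)     # opt-in: the caller's data is expanded
print(len(data))
```

With the default, the inferred triple exists only in the returned working copy, which is the situation the original post ran into.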

It appears that you have already dug that far, because you have tried the unofficial hacks that work around this limitation, including checking validator.target_graph (as discussed in #20) and passing inplace=True (as discussed in #78). These workarounds do work, but your code has an error: you are constructing Validator with the wrong options. (The options for constructing a Validator instance are passed as a dict, unlike the keyword parameters of the pyshacl.validate() helper.)

This should work using the internal target-graph method:

from pyshacl import Validator

v = Validator(g, shacl_graph=myshapes, options={"advanced": True, "inference": "rdfs"})
conforms, report_graph, report_text = v.run()
expanded_graph = v.target_graph #<-- This gets the expanded data graph

This should work using the unofficial inplace modifier method:

from pyshacl import Validator

v = Validator(g, shacl_graph=myshapes, options={"advanced": True, "inplace": True, "inference": "rdfs"})
conforms, report_graph, report_text = v.run()
g #<-- g is expanded inplace

or using the validate() helper function:

from pyshacl import validate

conforms, report_graph, report_text = validate(g, shacl_graph=myshapes, advanced=True, inference="rdfs", inplace=True)
g #<-- g is expanded inplace

If you believe PySHACL should be more than a validation engine, and have an alternate mode in which PySHACL acts as a general purpose entailment/rules expander, please discuss that in #60.

Note: I see you are not passing a SHACL shapes file in your examples. In that case, PySHACL searches the data graph for shapes. If it doesn't find any, it doesn't validate anything. I'm not sure whether that might also be a factor in the unexpected results you are seeing.

@ashleysommer
Collaborator

Hi @devonsparks
Can you confirm the above solves your issue? Can this thread be closed now?

@devonsparks
Author

Hi @ashleysommer - Yes, this does seem to resolve it. Apologies for not finding the duplicates sooner. I'd taken a look, but must have neglected to filter on closed issues. I will continue to discuss on #60. Okay to close.

Given the number of folks that seem to ask about this, maybe worth putting as an FAQ in the README? I'll raise on #60 for further discussion. Thanks!
