Where should my ontology go? Data graph versus shapes graph #185
Comments
I guess rdfs:subClassOf triples are what matters here. They impact sh:class and class-based targets. I believe we could change SHACL Core so that these triples will be considered from the union of data and shapes graphs. Would this address your concern or are there other triples in the data graph that should also be in the shapes graph and vice versa?
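To make the effect of that proposal concrete, here is a minimal sketch in plain Python. It is not taken from any SHACL implementation: graphs are modelled as sets of string triples, and the names `ex:alice`, `ex:Student`, `superclasses` and `shacl_instances` are hypothetical, chosen only for illustration. It compares class membership when rdfs:subClassOf is read from the data graph alone with membership when it is read from the union of data and shapes graphs.

```python
# A graph is a set of (subject, predicate, object) string triples.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def superclasses(cls, graph):
    """Return cls plus every class reachable from it via rdfs:subClassOf in graph."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def shacl_instances(cls, data_graph, subclass_graph):
    """Nodes typed in data_graph as cls, or as a subclass of cls per subclass_graph."""
    return {
        s
        for s, p, o in data_graph
        if p == RDF_TYPE and cls in superclasses(o, subclass_graph)
    }


# Hypothetical example: the subclass axiom lives only in the shapes graph.
data_graph = {("ex:alice", RDF_TYPE, "ex:Student")}
shapes_graph = {("ex:Student", SUBCLASS_OF, "foaf:Person")}

# Subclass triples taken from the data graph only.
data_only = shacl_instances("foaf:Person", data_graph, data_graph)

# Subclass triples taken from the union of data and shapes graphs.
with_union = shacl_instances("foaf:Person", data_graph, data_graph | shapes_graph)
```

In this sketch `data_only` is empty while `with_union` contains `ex:alice`, so a class-based target on foaf:Person would only select the node in the second case.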
Can't we define that all A union graph could make some edge cases, like validating constraints on a SHACL shape, difficult to process. A flag to enable that feature could solve it, but we should only consider it if there are other use cases than the
If we were to ignore rdfs:subClassOf triples from the data graph then we would introduce a breaking change to SHACL, which is something we definitely want to avoid for this (incremental) release. Adding the shapes graph as an extra graph to process is less likely to break existing use cases. But even that is potentially breaking.
An alternative solution is to introduce a new 3rd graph:
In SHACL 1.0, graph 3 is never given. In SHACL 1.1, it becomes possible to optionally specify graph 3. If graph 3 is specified, then all class and property statements (RDFS/OWL, including
Related: #183
FWIW, I'm aware of at least one tool that follows this practice, keeping data and ontology graphs separate as tool inputs but mixing them in memory. On the other hand, I work with an ontology community that uses shapes as part of its ontology specification, keeping the ontology graph and shapes graph together. This makes significant use of Implicit Class Targets in co-typing. That community happens to use the tool I noted, so the end result is two graphs (shapes graph S and ontology graph O) reviewing three (shapes graph S, ontology graph O, and data graph D; and yes, S=S and O=O). I just wanted to leave this user story as "anecdata," which might or might not help @wouterbeek's original guessed either-or:
I think we will learn the right way forward more from the inferencing work. My hunch is that the ontology and data graphs will typically have different update rhythms. If data updates, the ontology graph probably(?) wouldn't need to re-run inferencing. If the ontology graph updates, the data graph would probably need to re-run inferencing.
Originally posed over at #155; also see the comments by others over there.
Observation
According to the SHACL standard, two graphs are relevant for validation: the data graph and the shapes graph. The ontology should be part of the data graph:
This seems counter-intuitive to me, since I associate the ontology more with the shapes graph. For example, a shapes graph can owl:import an ontology.
Example
To illustrate my unease, let's take the following data graph:
And the following shapes graph:
Adding the following ontology graph is crucial; otherwise we cannot invalidate the data graph, which is missing a foaf:name statement:
Use case
I have a specific use case where this comes up: in TriplyETL we stream through the instance data. The stream passes along millions of small data graphs. For each of these data graphs, we have to add the ontology before the data graph can be validated in-stream. In this use case, it makes more sense to add the ontology to the shapes graph once, and use that same shapes graph to validate all data graphs that pass by.
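As a rough illustration of this streaming scenario, the sketch below loads a small ontology once and reuses it for every data graph in the stream. It is plain Python, not TriplyETL code and not a SHACL engine: graphs are sets of string triples, the only check is "every foaf:Person must have a foaf:name", and all identifiers (`ex:alice`, `ex:bob`, `ex:Student`, `is_instance_of`, `validate`) are hypothetical stand-ins for the graphs in the example above.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
FOAF_NAME = "foaf:name"


def is_instance_of(node, cls, data_graph, ontology):
    """True if node is typed in data_graph as cls or as one of its subclasses.

    Subclass triples are read from the union of data_graph and ontology.
    """
    merged = data_graph | ontology
    pending = [o for s, p, o in data_graph if s == node and p == RDF_TYPE]
    seen = set(pending)
    while pending:
        current = pending.pop()
        if current == cls:
            return True
        for s, p, o in merged:
            if s == current and p == SUBCLASS_OF and o not in seen:
                seen.add(o)
                pending.append(o)
    return False


def validate(data_graph, ontology, target_class="foaf:Person", required=FOAF_NAME):
    """Return the sorted target-class instances that lack the required property."""
    subjects = {s for s, _, _ in data_graph}
    return sorted(
        node
        for node in subjects
        if is_instance_of(node, target_class, data_graph, ontology)
        and not any(s == node and p == required for s, p, _ in data_graph)
    )


# The ontology is loaded once, alongside the shapes, and never copied into the data.
ontology = frozenset({("ex:Student", SUBCLASS_OF, "foaf:Person")})

# Two small data graphs standing in for the stream.
stream = [
    {("ex:alice", RDF_TYPE, "ex:Student")},  # no foaf:name
    {("ex:bob", RDF_TYPE, "ex:Student"), ("ex:bob", FOAF_NAME, "Bob")},
]

results = [validate(graph, ontology) for graph in stream]
without_ontology = [validate(graph, frozenset()) for graph in stream]
```

Under these assumptions `results` flags `ex:alice` in the first graph only, while `without_ontology` flags nothing, because without the subclass axiom neither node is recognised as a foaf:Person.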
Expected
I expect either of the following: