Decide how to modularize GAZ such that individual subsets can be managed in github #21
Comments
@rctauber how were you planning to split things into modules? I see you have a breakdown by country just now. Do you include everything that is located in a country, including geographic features such as lakes, rivers and the like? What about features that overlap two countries?
The country modules are everything that is related by either located_in or subClassOf. I'm not sure how overlapping features are handled currently in GAZ, but the modules would reflect that. We originally discussed starting with countries, then expanding to other subsets like oceans and seas. But if overlapping features appear in multiple modules (and I imagine there will be overlap between things like countries and oceans and seas), it will be hard to make sure things stay up-to-date if we are using the modules to develop...
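To make the "everything related by located_in or subClassOf" rule concrete, here is a minimal sketch of how a country module could be collected by walking those two relations downward from the country. This is not how the modules are actually generated; the edge list, the `ex:` identifiers, and the function name `extract_module` are all made up for illustration.

```python
from collections import defaultdict, deque

# Hypothetical (child, relation, parent) edges; the IDs are invented for this sketch.
EDGES = [
    ("ex:Ontario", "located_in", "ex:Canada"),
    ("ex:Toronto", "located_in", "ex:Ontario"),
    ("ex:LakeSuperior", "located_in", "ex:Canada"),
    ("ex:LakeSuperior", "located_in", "ex:USA"),
    ("ex:Texas", "located_in", "ex:USA"),
]

# Relations that pull an entity into a module, per the description above.
MODULE_RELATIONS = {"located_in", "subClassOf"}


def extract_module(root, edges, relations=MODULE_RELATIONS):
    """Return root plus every entity that reaches it via the given relations."""
    children = defaultdict(set)
    for child, rel, parent in edges:
        if rel in relations:
            children[parent].add(child)

    # Breadth-first walk from the root down through its descendants.
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


canada = extract_module("ex:Canada", EDGES)
usa = extract_module("ex:USA", EDGES)
```

In this toy data the lake ends up in both `canada` and `usa`, which is exactly the overlap case raised above.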
@rctauber
As long as they're of different types, I think there shouldn't be conflicts in the subClassOf hierarchies. The RO:overlaps relation and its subproperties can be (is?) used to assert this sort of mereotopology. Even if these are in different modules, this should hold as long as there are some checks in place to make sure classes/instances are present across modules. On that note, @cmungall and I had several conversations over the years about the need to generalise spatial relations in ontologies like BSPO and RO to the planetary science case. I think GAZ will need these too. @cmungall time for an RO-geo subset? Branching off to #24
What about the 'located in' hierarchies, though? The modules include subclasses and located in. For example, say a river is located in two countries and we need to update the label of that river. Even if we check for 'overlaps', how do we know which one is newer? I guess I could write a script that takes the changes from the most recently updated modules, but it may get complicated.
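A rough sketch of what such a script might look like, assuming each module can report a last-modified date per entity (that is an assumption; nothing in this thread says the modules track this). The record layout and the function name `newest_labels` are hypothetical.

```python
from datetime import date


def newest_labels(module_records):
    """Pick, for each entity, the label from the most recently modified copy.

    module_records: iterable of (module, entity, label, modified) tuples,
    where modified is a datetime.date. This layout is an assumption.
    """
    newest = {}
    for module, entity, label, modified in module_records:
        current = newest.get(entity)
        # Keep this record if the entity is new or this copy is more recent.
        if current is None or modified > current[2]:
            newest[entity] = (module, label, modified)
    return {entity: label for entity, (module, label, modified) in newest.items()}


# Hypothetical river that appears in two country modules with different labels.
records = [
    ("module_a", "ex:River1", "Old River Name", date(2019, 4, 1)),
    ("module_b", "ex:River1", "New River Name", date(2019, 5, 6)),
    ("module_a", "ex:Town1", "Some Town", date(2019, 3, 2)),
]
labels = newest_labels(records)
```

This only handles the simple case; two copies edited on the same date, or edits to different annotations on each copy, would still need a manual decision.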
Before we tackle the above problem, I think this is the more important issue. On another note, I can regenerate the modules from GAZ to keep them up-to-date, but I'm using a version of ROBOT that has a few unreleased features. The two main ones are improved templating and use of Jena's TDB feature to store a dataset on-disk (which makes querying infinitely faster). I'm pushing to get the updated templating merged in, then I need to make a PR for the Jena stuff. I don't want to include a custom ROBOT JAR in this repo since there are already many large files. As soon as these features are released, I can add the rules to the |
@rctauber
Not sure if I am totally following. This issue is about modularization rather than labels; it sounds like you may also be making unique labels? (see #26). But in answer to the main question, it should not be possible for an entity to be RO:located-in two locations where those locations do not overlap (by definition). Thus, if we choose non-overlapping units as the modules and placement in the modules is determined by located-in, then nothing should be in more than one module. But note:
|
Let me also state a few assumptions to check I'm on the same page as everyone:
|
Sorry, I wasn't super clear. I was just using that as an example if we wanted to update the label of an entity that existed in two modules. This wouldn't be a problem if we are able to define non-overlapping modules, as you suggest above. I agree with your stated assumptions.
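If the modules are meant to be non-overlapping, that property is easy to verify after each regeneration. A minimal sketch, assuming each module can be reduced to a set of entity IDs; the module names, IDs, and the function name `entities_in_multiple_modules` are hypothetical.

```python
from collections import defaultdict


def entities_in_multiple_modules(modules):
    """Map each entity found in more than one module to the modules holding it.

    modules: dict of module name -> set of entity IDs (assumed layout).
    """
    membership = defaultdict(set)
    for name, entities in modules.items():
        for entity in entities:
            membership[entity].add(name)
    return {e: sorted(names) for e, names in membership.items() if len(names) > 1}


# Hypothetical module contents.
modules = {
    "country_a": {"ex:Town1", "ex:River1"},
    "country_b": {"ex:Town2", "ex:River1"},
}
shared = entities_in_multiple_modules(modules)
```

An empty result would mean the chosen units really are disjoint; anything reported is a candidate for an overlaps assertion or for a separate module.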
@rctauber going back to your comment from May 6. What are your plans for robot templates here?
I don't have templates for the modules right now, but I can always make them if need be. I'm starting to see that ROBOT is having some trouble with any entities that are both named individuals and classes. For example,
and...
I'm trying to use |
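For reference, an entity is "punned" in this sense when the same IRI is declared both as an owl:Class and as an owl:NamedIndividual. A minimal sketch for listing such entities, assuming the declarations have already been exported as (IRI, type) pairs; the export step, the sample IRIs, and the function name `find_punned` are assumptions and this is not a ROBOT feature.

```python
def find_punned(declarations):
    """Return IRIs declared as both a class and a named individual.

    declarations: iterable of (iri, kind) pairs, where kind is
    "owl:Class" or "owl:NamedIndividual" (assumed export format).
    """
    declarations = list(declarations)  # allow generators; we iterate twice
    classes = {iri for iri, kind in declarations if kind == "owl:Class"}
    individuals = {iri for iri, kind in declarations if kind == "owl:NamedIndividual"}
    return classes & individuals


# Hypothetical declarations.
decls = [
    ("ex:Place1", "owl:Class"),
    ("ex:Place1", "owl:NamedIndividual"),
    ("ex:Place2", "owl:NamedIndividual"),
]
punned = find_punned(decls)
```

A list like this would show how many entities need a decision before modularizing.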
I agree we should fix the punning first. My question was more along the lines of what you thought was best for the overall strategy. One possibility would be to maintain the entire ontology as a TSV and generate via robot template. I thought you might be thinking along these lines. There would be some definite advantages here. But it could be awkward editing the relational graph. And having mixed mode TSV and OWL may just add more complexity to what is already likely to turn into quite a complex build. It may be the case that we don't need to worry about templates just now and just focus on modularizing the OWL (but still, fixing the punning would be good)
My plan was to modularize first, and then determine if we want to move to templates later. So I think we are in agreement there. I think we should discuss #20 on our next GAZ call and (perhaps) move forward on converting all those into individuals. Then, I could work on building a "bucket" that contains all the terms not in one of the country modules.
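The "bucket" is just a set difference: all terms minus everything already covered by a country module. A minimal sketch under the assumption that both sides are available as sets of entity IDs; the names and IDs are hypothetical.

```python
def build_bucket(all_terms, modules):
    """Return the terms that are not in any module.

    all_terms: iterable of entity IDs for the whole ontology.
    modules: dict of module name -> set of entity IDs (assumed layout).
    """
    covered = set().union(*modules.values())
    return set(all_terms) - covered


# Hypothetical data: one ocean that belongs to no country module.
all_terms = {"ex:Town1", "ex:Town2", "ex:Ocean1"}
country_modules = {
    "country_a": {"ex:Town1"},
    "country_b": {"ex:Town2"},
}
bucket = build_bucket(all_terms, country_modules)
```

Later subsets such as oceans and seas could then be carved out of the bucket the same way.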