The range of inSubset should be an IRI, not a literal #1527

Open
cmungall opened this issue Jul 31, 2024 · 10 comments · Fixed by #1528
@cmungall
Member

The range of inSubset should be an IRI, not a string literal. We have a mix at the moment.

valid:

AnnotationAssertion(<http://www.geneontology.org/formats/oboInOwl#inSubset> <http://purl.obolibrary.org/obo/ENVO_00001998> <http://purl.obolibrary.org/obo/envo#EnvO-Lite-GSC>)

not valid:

AnnotationAssertion(<http://www.geneontology.org/formats/oboInOwl#inSubset> <http://purl.obolibrary.org/obo/ENVO_00000887> "wwfBiome")

These are the counts of invalid (string literal) values, per subset:

environmental_hazards|90
envoAstro|72
envoAtmo|111
envoCesab|5
envoCloudAtlas|10
envoCmecs|36
envoCryo|40
envoEOVs|9
envoEmpo|37
envoMarine|62
envoMeo|27
envoNceas|22
envoOmics|35
envoPlastics|69
envoPolar|442
nlcd2011|22
subset_siren|37
wwfBiome|41
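
Counts like these could be reproduced directly from the functional-syntax file. The following is only a minimal sketch, not the method used for the numbers above: it assumes each assertion sits on one line, carries no axiom annotations, and has no escaped quotes in the literal. The function name `count_literal_subsets` is made up for illustration.

```python
import re
from collections import Counter

IN_SUBSET = "http://www.geneontology.org/formats/oboInOwl#inSubset"

# Matches inSubset assertions whose value is a quoted string literal
# (assumption: no Annotation(...) prefix and no escaped quotes).
LITERAL_ASSERTION = re.compile(
    r'AnnotationAssertion\(<' + re.escape(IN_SUBSET) + r'>\s+<[^>]+>\s+"([^"]*)"'
)

def count_literal_subsets(ofn_text):
    """Count inSubset assertions with a string-literal value, per subset name."""
    return Counter(LITERAL_ASSERTION.findall(ofn_text))

# The two example axioms from this issue: one valid (IRI), one invalid (literal).
sample = """
AnnotationAssertion(<http://www.geneontology.org/formats/oboInOwl#inSubset> <http://purl.obolibrary.org/obo/ENVO_00001998> <http://purl.obolibrary.org/obo/envo#EnvO-Lite-GSC>)
AnnotationAssertion(<http://www.geneontology.org/formats/oboInOwl#inSubset> <http://purl.obolibrary.org/obo/ENVO_00000887> "wwfBiome")
"""

for name, n in sorted(count_literal_subsets(sample).items()):
    print(f"{name}|{n}")
```

Only the literal-valued assertion is counted; the IRI-valued one is ignored.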

We already have IRIs defined for the subsets - nothing links to them so far


Note: this issue should not be used for discussing whether we use inSubset or a different AP. We have a different mega-ticket for this: #1202. Fixing the current ranges does not preclude using a different AP in the future, and would be a good incremental step towards this goal.

@turbomam
Contributor

turbomam commented Dec 17, 2024

As of http://purl.obolibrary.org/obo/envo/releases/2024-07-01/envo.owl, the inSubset axioms still use a mixture of IRIs and string literals.

also

  • some of the literals have inconsistent datatypes
  • there's almost no overlap between the IRIs used in inSubset axioms and the subset property IRIs defined in the ontology

@turbomam
Contributor

turbomam commented Dec 17, 2024

SELECT ?o (COUNT(?s) AS ?s_count) 
(IF(isIRI(?o), "IRI", "Literal") AS ?type)
WHERE {
    ?s <http://www.geneontology.org/formats/oboInOwl#inSubset> ?o .
}
GROUP BY ?o

@turbomam
Contributor

turbomam commented Dec 17, 2024

?o ?s_count ?type
http://purl.obolibrary.org/obo/envo#EnvO-Lite-GSC 20 IRI
http://purl.obolibrary.org/obo/po#Angiosperm 2 IRI
http://purl.obolibrary.org/obo/po#Arabidopsis 1 IRI
http://purl.obolibrary.org/obo/po#CL 2 IRI
http://purl.obolibrary.org/obo/po#Gymnosperms 2 IRI
http://purl.obolibrary.org/obo/po#Maize 2 IRI
http://purl.obolibrary.org/obo/po#Musa 1 IRI
http://purl.obolibrary.org/obo/po#Poaceae 1 IRI
http://purl.obolibrary.org/obo/po#Potato 1 IRI
http://purl.obolibrary.org/obo/po#reference 7 IRI
http://purl.obolibrary.org/obo/po#Rice 2 IRI
http://purl.obolibrary.org/obo/po#Tomato 1 IRI
http://purl.obolibrary.org/obo/po#TraitNet 21 IRI
http://purl.obolibrary.org/obo/RO_0002259 12 IRI
http://purl.obolibrary.org/obo/ro/subsets#ro-eco 50 IRI
http://purl.obolibrary.org/obo/valid_for_go_annotation_extension 18 IRI
http://purl.obolibrary.org/obo/valid_for_go_gp2term 10 IRI
http://purl.obolibrary.org/obo/valid_for_go_ontology 14 IRI
http://purl.obolibrary.org/obo/valid_for_gocam 19 IRI
environmental_hazards 90 Literal
envoAstro 72 Literal
envoAtmo 109 Literal
envoAtmo@en 2 Literal
envoCesab 5 Literal
envoCloudAtlas 1 Literal
envoCloudAtlas@en 9 Literal
envoCmecs 36 Literal
envoCryo 40 Literal
envoEmpo 37 Literal
envoEOVs 9 Literal
envoMarine 62 Literal
envoMeo 27 Literal
envoNceas 22 Literal
envoOmics 35 Literal
envoPlastics 31 Literal
envoPlastics@en 36 Literal
envoPlastics@es 1 Literal
envoPlastics@zh 1 Literal
envoPolar 441 Literal
envoPolar@en 1 Literal
nlcd2011 22 Literal
subset_siren 36 Literal
subset_siren@en 1 Literal
wwfBiome 41 Literal
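
As a sanity check, the rows that differ only by a language tag can be summed per subset name and compared with the counts in the opening comment. This is a small sketch using the numbers copied from the results above; it assumes the `@en`, `@es` and `@zh` suffixes are how the query tool rendered language-tagged literals, and `totals_by_name` is just an illustrative helper.

```python
from collections import defaultdict

# (literal as displayed in the results, count), copied from the table above.
rows = [
    ("envoAtmo", 109), ("envoAtmo@en", 2),
    ("envoCloudAtlas", 1), ("envoCloudAtlas@en", 9),
    ("envoPlastics", 31), ("envoPlastics@en", 36),
    ("envoPlastics@es", 1), ("envoPlastics@zh", 1),
    ("envoPolar", 441), ("envoPolar@en", 1),
    ("subset_siren", 36), ("subset_siren@en", 1),
]

def totals_by_name(rows):
    """Sum counts per subset name, ignoring any @lang suffix."""
    totals = defaultdict(int)
    for value, count in rows:
        name = value.split("@", 1)[0]  # assumption: '@' only introduces a language tag
        totals[name] += count
    return dict(totals)

for name, total in totals_by_name(rows).items():
    print(name, total)
```

The totals come to 111, 10, 69, 442 and 37, which match the per-subset counts in the original report, so the split rows appear to be the same assertions broken out by language tag.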

@turbomam
Contributor

turbomam commented Dec 17, 2024

PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s (GROUP_CONCAT(?c;
        SEPARATOR=" | ") AS ?comments)
WHERE {
    ?s rdfs:subPropertyOf+ <http://www.geneontology.org/formats/oboInOwl#SubsetProperty> .
    OPTIONAL {
        ?s rdfs:comment ?c
    }
}
GROUP BY ?s
order by ?s

@turbomam
Contributor

turbomam commented Dec 17, 2024

@turbomam
Contributor

turbomam commented Jan 10, 2025

I think I missed an important point when I re-opened this issue on December 17th

#1527 (comment)

I was complaining that the string-object inSubset statements were still present... but that's because I was looking at the most recent https://github.com/EnvironmentOntology/envo/blob/master/envo.owl, which doesn't contain the merged but unreleased contents of branch issue-1527, which was merged into src/envo/envo-edit.owl by

@turbomam
Contributor

I do see this change from branch issue-1527 in envo-edit.owl in the main branch now:

from
AnnotationAssertion(<http://www.geneontology.org/formats/oboInOwl#inSubset> <http://purl.obolibrary.org/obo/ENVO_00000012> "envoPolar")

to
AnnotationAssertion(<http://www.geneontology.org/formats/oboInOwl#inSubset> <http://purl.obolibrary.org/obo/ENVO_00000012> <http://purl.obolibrary.org/obo/envo#envoPolar>)
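
For illustration only, a rewrite of this shape could be sketched in Python as below. This is not the change that was actually made on branch issue-1527. It assumes the subset IRI is the literal appended to `http://purl.obolibrary.org/obo/envo#` (as in the example above), that each axiom is on one line, and that the literal is a plain or language-tagged string; `literal_to_iri` is a made-up name.

```python
import re

IN_SUBSET = "http://www.geneontology.org/formats/oboInOwl#inSubset"
SUBSET_BASE = "http://purl.obolibrary.org/obo/envo#"  # assumed from the example above

# Group 1: everything up to the value; group 2: the subset name inside quotes.
# An optional language tag (e.g. @en) is dropped; typed literals are not handled.
PATTERN = re.compile(
    r'(AnnotationAssertion\(<' + re.escape(IN_SUBSET) + r'>\s+<[^>]+>\s+)'
    r'"([A-Za-z0-9_]+)"(?:@[A-Za-z-]+)?(?=\))'
)

def literal_to_iri(line):
    """Replace a string-literal inSubset value with the assumed subset IRI."""
    return PATTERN.sub(lambda m: f"{m.group(1)}<{SUBSET_BASE}{m.group(2)}>", line)

before = ('AnnotationAssertion(<http://www.geneontology.org/formats/oboInOwl#inSubset> '
          '<http://purl.obolibrary.org/obo/ENVO_00000012> "envoPolar")')
print(literal_to_iri(before))
```

Lines whose value is already an IRI do not match the pattern and pass through unchanged.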

@turbomam
Contributor

OK, this is actually in a very good state now except for these, as determined by src/envo/reports/envo-subsetTable.tsv

?label | ?subset | ?definition | ?URI
greenhouse gas | envoPolar | | http://purl.obolibrary.org/obo/CHEBI_76413
piece of plastic | envoPlastics | A mass of solid material which is primarily composed of plastic. | http://purl.obolibrary.org/obo/ENVO_01000776
thermosetting disposition | envoPlastics | A disposition which inheres in materials capable of becoming rigid when cured by heating. | http://purl.obolibrary.org/obo/ENVO_06105004
food (pickled) | subset_siren | A food preserved by soaking and storing it in vinegar or brine. | http://purl.obolibrary.org/obo/FOODON_00001079
methanogenesis | envoPolar | | http://purl.obolibrary.org/obo/GO_0015948
photosynthesis | envoPolar | | http://purl.obolibrary.org/obo/GO_0015979
anaerobic respiration, using ammonium as electron donor | envoPolar | | http://purl.obolibrary.org/obo/GO_0019331
aerobic respiration, using nitrite as electron donor | envoPolar | | http://purl.obolibrary.org/obo/GO_0019332
aerobic respiration, using ammonia as electron donor | envoPolar | | http://purl.obolibrary.org/obo/GO_0019409
aerobic respiration, using ferrous ions as electron donor | envoPolar | | http://purl.obolibrary.org/obo/GO_0019411
aerobic respiration, using hydrogen as electron donor | envoPolar | | http://purl.obolibrary.org/obo/GO_0019412
aerobic respiration, using sulfur or sulfate as electron donor | envoPolar | | http://purl.obolibrary.org/obo/GO_0019414
ecological community | envoPolar | A multi-species collection of organisms of at least two different species, living in a particular area. Must have at least two populations of different species as members. | http://purl.obolibrary.org/obo/PCO_0000002

@turbomam
Contributor

The build process is still able to create a fresh src/envo/subsets/envoPolar.tsv with ~ 670 rows. I still don't understand that process, since 'envoPolar' is mentioned by name in the definition of $(SUBSETS) in src/envo/Makefile, but I'll keep studying.

@turbomam
Contributor

probably because I don't know anything about OWLTOOLS

subsets/%.owl: subsets/envo-basic.obo
	$(OWLTOOLS) $< --extract-ontology-subset --fill-gaps --subset $* -o $@.tmp && \
	$(ROBOT) annotate -i $@.tmp --ontology-iri $(ONTBASE)/$@ --version-iri $(ONTBASE)/releases/$(TODAY)/$@  -a owl:versionInfo $(TODAY) \
	  --output $@
