Docs: Are built-in prefixes in metadata required, or recommended? #365

Closed
joeflack4 opened this issue May 20, 2024 · 3 comments

Comments

@joeflack4
Contributor

joeflack4 commented May 20, 2024

Overview

Just a request to have documentation on this question:
Are built-in prefixes in metadata required, or recommended?

Additional info

Thoughts
I prefer to be minimalist, to the point that I want a prefix in my metadata if and only if it appears in the mapping set, i.e. in at least one row of my TSV. But I do see the UX benefits of including common built-ins.

@matentzn
Collaborator

@gouttegd has already documented this here:

https://mapping-commons.github.io/sssom/spec/#tsv

The YAML metadata block MUST contain a curie map that allows the unambiguous interpretation of CURIES. A curie map is supplied after a curie_map: parameter in the yaml file. The value is a dictionary of CURIE->URLPREFIX pairs. Note that the following prefixes are built-in and (1) MUST NOT be changed from their SSSOM default interpretation and (2) MAY be omitted from the curie map: "sssom", "owl", "rdf", "rdfs", "skos", "semapv".

So, you are right: they are optional, and sssom-py is a bit obsessive about adding these optional prefixes.

@gouttegd
Contributor

I plan to clarify that even further in my upcoming¹ overhaul of the spec by defining a canonical TSV serialisation.

The canonical serialisation is the serialisation that SSSOM/TSV writers will be recommended to adopt in order to minimise serialisation differences across implementations, and so avoid possibly huge and meaningless diffs when an SSSOM/TSV file is modified by different tools. It’s a generalisation of the logic by which we are already recommending that TSV columns should be written in a spec-defined order.

Regarding the prefix map, the “canonical” guideline will be that SSSOM/TSV writers should write the minimal effective prefix map – that is, the prefix map should only contain prefix names that (1) are not already built-in, and (2) are effectively used somewhere in the mapping set.
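To make the writer-side guideline concrete, here is a minimal sketch of computing a "minimal effective prefix map". It is not sssom-py code: the function name `minimal_prefix_map`, the plain-dict representation of the prefix map, and the example `HP`/`MP`/`UNUSED` entries are all assumptions made for illustration. Only the list of built-in prefix names comes from the spec text quoted above.

```python
# Built-in prefix names, as listed in the spec excerpt quoted earlier in this thread.
BUILT_IN_PREFIXES = {"sssom", "owl", "rdf", "rdfs", "skos", "semapv"}


def minimal_prefix_map(prefix_map, curies):
    """Return only the entries a canonical writer would emit (hypothetical helper).

    An entry is kept if its prefix name (1) is not built-in and
    (2) is used by at least one CURIE in the mapping set.
    """
    # The prefix name is everything before the first colon of a CURIE.
    used = {curie.split(":", 1)[0] for curie in curies if ":" in curie}
    return {
        name: url
        for name, url in prefix_map.items()
        if name in used and name not in BUILT_IN_PREFIXES
    }


# Illustrative input: the URL prefixes here are placeholders, not authoritative values.
declared = {
    "HP": "http://purl.obolibrary.org/obo/HP_",
    "MP": "http://purl.obolibrary.org/obo/MP_",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "UNUSED": "https://example.org/unused/",
}
curies_in_set = ["HP:0000001", "skos:exactMatch", "MP:0000001"]

minimal = minimal_prefix_map(declared, curies_in_set)
```

In this sketch, `skos` is dropped because it is built-in and `UNUSED` is dropped because no CURIE uses it, so only `HP` and `MP` would be written.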

But that’s only a recommendation for writers. For SSSOM/TSV readers, all that matters is that (1) there are no prefix names in the set that are not declared in the prefix map, with the exception of the built-in prefix names, and (2) if the built-in prefix names are declared, they must point to the same prefixes as in the spec.

So it is never wrong for a set to have a prefix map that contains superfluous prefix names (names that are built-in and/or that are never used in the set), and readers must never reject a set because of that. It is simply recommended that writers avoid generating such maps.
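The reader-side rules can be sketched in the same way. The snippet below is an illustration under stated assumptions, not the behaviour of any existing SSSOM library: `check_prefix_map` is a hypothetical helper, and the URL prefixes in `BUILT_IN_PREFIX_MAP` are the commonly used namespaces for these vocabularies, which should be checked against the spec before being relied on.

```python
# Assumed default expansions for the built-in prefix names; verify against the spec.
BUILT_IN_PREFIX_MAP = {
    "sssom": "https://w3id.org/sssom/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "semapv": "https://w3id.org/semapv/vocab/",
}


def check_prefix_map(prefix_map, curies):
    """Return a list of problems a reader should report (hypothetical helper).

    Superfluous entries (built-in or unused names) are deliberately NOT reported.
    """
    problems = []

    # Rule 2: a declared built-in name must keep its default expansion.
    for name, url in prefix_map.items():
        expected = BUILT_IN_PREFIX_MAP.get(name)
        if expected is not None and url != expected:
            problems.append(f"built-in prefix '{name}' is redefined")

    # Rule 1: every prefix name used in the set must be declared or built-in.
    used = sorted({curie.split(":", 1)[0] for curie in curies if ":" in curie})
    for name in used:
        if name not in prefix_map and name not in BUILT_IN_PREFIX_MAP:
            problems.append(f"prefix '{name}' is used but not declared")

    return problems


# A map with superfluous entries is accepted: no problems are reported.
superfluous_ok = check_prefix_map(
    {
        "HP": "http://purl.obolibrary.org/obo/HP_",
        "skos": "http://www.w3.org/2004/02/skos/core#",
        "UNUSED": "https://example.org/unused/",
    },
    ["HP:0000001", "skos:exactMatch"],
)

# An undeclared, non-built-in prefix name is a problem.
undeclared = check_prefix_map({}, ["HP:0000001", "skos:exactMatch"])

# Redefining a built-in prefix name is a problem.
redefined = check_prefix_map({"skos": "https://example.org/skos/"}, ["skos:exactMatch"])
```

The asymmetry is the point of the design: the checks only ever flag undeclared or redefined names, so a set is never rejected merely for declaring more than it needs.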

--
¹ Yes, it is coming up. At some point. Eventually.

@joeflack4
Contributor Author

Thanks for the helpful responses! This is already done, then! Closing.
