[New metadata element]: status #345
Comments
triggered by #309 Note that I have used |
I like it. I agree about the re-use of Also agree that collapsing |
Of note, though, regarding your example: the |
I like it as well. I am wondering if we should consider a more descriptive name like @dr-shorthair would you like to join the SSSOM developer team and make the PR towards this field yourself? Do you have any experience with LinkML (no problem if not, I can hold hands). |
The list of possible values for the status slot should probably be discussed a bit more with other people interested, though. In particular, I’m not sure about the need for a What would it mean, then, to have a mapping with a status other than |
@gouttegd I propose using When you are looking at a mapping_set, for each set of mappings with a specific ( |
Fine by me.
I have no experience with LinkML.
OK - I can buy that. Corrected above.
But why would you want to include all the different steps of the submission and review process in the published mapping set? Transparency and traceability are nice, but I don’t think it should go that far. Recording the history of a new mapping is the role of a version control system; the entire history doesn’t need to appear in the published set.
This is quite a breaking change. Currently, it is considered perfectly normal for a mapping set to have more than one record (one row) for any given mapping, because any given mapping can have multiple justifications (e.g., one asserted by manual curation, one asserted by semantic similarity, another asserted by semantic similarity but with a different tool, etc.). Now, only the last record should be considered valid (previous records are the history of the last record), so it’s no longer possible to have several justifications for one mapping.
However, the current SSSOM element usage is ambiguous and far from complete. If we were to try to keep such information in a single mapping record, then a much much wider record would be required. The proposal is for a more scalable approach, using multiple records per mapping, as suggested by @gouttegd, where I'm adding a But there has to be some way to know which records are valid for use. This must involve some combination of status and date. I might not have it quite right, but that is my rationale for trying. |
Starting to work on your proposal now; my stance on this specific question is that it is out of scope for SSSOM - we provide the metadata for the client to make the decision about "which records are valid", and the mapping set provider can simply publish a mapping set which only contains the valid mappings. Stay tuned for a basic PR to continue the discussion.
Here is a first PR to address the issue: #347. Let's be strict about the implementation and get this right.
Thanks @matentzn . Perhaps the use-case I sketched above is out of scope for SSSOM. I was led toward the multiple-rows pattern by this comment from @gouttegd - but maybe I extrapolated too far. While there may be local applications of SSSOM that head in this direction, they do not need to be part of the SSSOM canon until there is more testing. Nevertheless, I think the narrow proposal, for a I find the need for a status flag in almost every bit of design work that I do, sooner or often later, and I'm forever scratching around for a suitable slot and enumeration to fit. It feels like one of those basic elements that should always be added early, like |
I was indeed initially favourable to the idea – until I started thinking about the practical implications. Perhaps you can clarify one thing: is your use case about storing the history of a mapping record (a), or about storing its current status (b)? In other words:
Your initial example clearly suggests (a), but maybe this was just for the example? I don’t think I have any concerns with (b). But I do have concerns if the use case is about (a). I believe managing history is out of scope for a Simple Standard for Sharing Ontological Mappings, and it opens several thorny questions that must be addressed if we are ever to do that. |
I accept that external consumers only want to see the 'current' state of the mapping - i.e. one row per mapping, use-case (b). Having a (Behind the scenes, we might persist with some form of (a). This would be mostly so that we can leverage the rigour of SSSOM - i.e. predicate, justification, confidence etc - without asking non-technical subject-matter experts to step outside of the tabular/spreadsheet paradigm. Keeping it all within one artefact - the spreadsheet - is important for the user-community I am working with. We can then filter on the latest-date prior to making it visible externally. But we can write our own rules for all that. ) |
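For illustration only, here is a minimal Python sketch of the "filter on the latest date" step described above: keep one row per (subject, predicate, object) before publishing. It assumes the working rows carry the existing `subject_id`, `predicate_id`, `object_id` and `mapping_date` columns plus the `status` column proposed in this issue. The function name `latest_per_mapping`, the sample identifiers and the status strings are all hypothetical placeholders, not agreed SSSOM values.

```python
from datetime import date

# Hypothetical working rows. "status" is the slot proposed in this issue;
# the values below are placeholders, not an agreed SSSOM enumeration.
rows = [
    {"subject_id": "A:1", "predicate_id": "skos:exactMatch", "object_id": "B:1",
     "mapping_date": "2024-01-10", "status": "placeholder-earlier"},
    {"subject_id": "A:1", "predicate_id": "skos:exactMatch", "object_id": "B:1",
     "mapping_date": "2024-03-02", "status": "placeholder-later"},
    {"subject_id": "A:2", "predicate_id": "skos:broadMatch", "object_id": "B:7",
     "mapping_date": "2024-02-15", "status": "placeholder-earlier"},
]


def latest_per_mapping(rows):
    """Keep only the most recently dated row for each (subject, predicate, object)."""
    latest = {}
    for row in rows:
        key = (row["subject_id"], row["predicate_id"], row["object_id"])
        current = latest.get(key)
        # Replace the stored row only when this one has a strictly later date.
        if current is None or (
            date.fromisoformat(row["mapping_date"])
            > date.fromisoformat(current["mapping_date"])
        ):
            latest[key] = row
    return list(latest.values())


published = latest_per_mapping(rows)
for row in published:
    print(row["subject_id"], row["object_id"], row["mapping_date"], row["status"])
```

Note that grouping on the triple alone collapses rows that differ only in their justification, which is the concern raised earlier in the thread. A local rule could add `mapping_justification` to the key if several justifications per mapping need to survive the filter.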
Element id (e.g. creator_id, mapping_tool_version):
Value data type (e.g. URI, URL, text, xsd:boolean):
Description
Indicate the status of this individual mapping. This allows a submission/review/publication/retirement lifecycle to be tracked using a set of mappings with the same subject/predicate/object.
Standard status values should be established for SSSOM. I suggest
Example description.
Complete example to a SSSOM file with this element