-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revise decision record: sorted_sequences #77
Comments
Ok, I can definitely clarify that rationale better, or tone down the dismissive language. It may actually be useful to try to dissuade people from using this, because, in my opinion, using this is a bad move that is less useful than using the other, better features of sequence collections. So, I think the dismissive language here may be warranted and useful and instructive, but I haven't effectively explained why/the alternative. To understand your use case, one thing I'm confused about: You're looking to compare coordinate spaces:
But then wouldn't you use A few other thoughts:
|
I toned down the dismissive language in the ADR :) |
A lot has changed here, and sorted_sequences is now defined in the spec, and also with the attribute endpoint is actually actively encouraged to be used. So I'm going to go ahead and close this issue since I think it's been addressed in the latest iteration of the spec, but feel free to re-open if you want to continue this discussion. |
https://github.com/ga4gh/seqcol-spec/blob/4722c713a9a5692b255f8bdf9ab27c18ca297042/docs/decision_record.md?plain=1#L19-L21
The above text conveys some opinions about the relative unimportance and utility of this feature.
From my perspective, using this feature is important, as it allows us to design federated applications with a reduced overhead on implementations: if a collection of sequences from which a dataset was generated can be universally identified by a GA4GH digest, based purely on sequence content (and not names, lengths, or order of sequences), then a one-time digest is sufficient for answering the question "do these resources represent variants in the same coordinate space?". I think it is much more likely that two implementations across a network will have the same underlying sequence content, but describe sequences in different orders or with different names (and therefore may not be able to compute
/comparison/:digest1/:digest2
ifdigest2
is an unrecognized permutation ofdigest1
by that instance). It therefore seems to me to that implementing this feature represents the most obvious case for describing Sequence Collections across a federated service network, instead of a case with limited / fringe utility.I recognize that as someone looking at this spec from the outside, I may not be grasping the larger importance of the comparison function or how Sequence Collections are intended to be used. And in deference to the work and internal discussions of the SeqCol team I am not suggesting that this feature become required or an exemplar case for Sequence Collections. However, I think it would be more inclusive of potential adopters with a similar view as my own to simply describe this feature and its use in the decision record, and tone down any language that may appear dismissive of this use case.
The text was updated successfully, but these errors were encountered: