shot at bringing draft spec up to date with adrs #44

Merged
merged 7 commits into from
Aug 22, 2023


Member

@nsheff nsheff commented Apr 13, 2023

Far from final, but this at least brings the draft specification up to date with our current decisions. Take a look if you can; the goal here is to provide a general, one-stop description of seqcol that could be useful to have during Connect.

I do still need to get the 'inherent' stuff in there, though.

TODO:

@nsheff nsheff requested review from andrewyatz and tcezard April 13, 2023 12:59
@nsheff nsheff changed the base branch from master to dev August 22, 2023 14:48
@nsheff nsheff merged commit 2e96056 into dev Aug 22, 2023
Member Author

nsheff commented Aug 22, 2023

I will merge this to dev to make it easier to see and review this.

@nsheff nsheff deleted the spec_rewrite branch August 22, 2023 14:49
Comment on lines +160 to +166
The GA4GH digest algorithm, `sha512t24u`, was created as part of the [Variation Representation Specification standard](https://vrs.ga4gh.org/en/stable/impl-guide/computed_identifiers.html). This procedure is described as ([Hart _et al_. 2020](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0239883)):

- perform a SHA-512 digest on a binary blob of data
- truncate the resulting digest to 24 bytes
- encode the 24 bytes using `base64url` ([RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648#section-5)), resulting in a 32-character string
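
The three steps above can be sketched in Python using only the standard library. This is a minimal illustration, not the reference implementation: the function name `sha512t24u` is borrowed from the algorithm name, and the sketch assumes that "truncate" means keeping the first 24 bytes of the digest.

```python
import base64
import hashlib


def sha512t24u(blob: bytes) -> str:
    """Sketch of the digest procedure: SHA-512, truncate, base64url-encode."""
    # Step 1: SHA-512 digest of the binary blob (64 bytes).
    digest = hashlib.sha512(blob).digest()
    # Step 2: truncate to 24 bytes (assumption: the first 24 bytes are kept).
    truncated = digest[:24]
    # Step 3: base64url-encode. 24 bytes map to exactly 32 characters,
    # so the result carries no "=" padding.
    return base64.urlsafe_b64encode(truncated).decode("ascii")


if __name__ == "__main__":
    print(sha512t24u(b"ACGT"))
```

Because 24 is a multiple of 3, the encoded string is always 32 characters long and never needs padding.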

This converts the value of each attribute in the seqcol into a digest string. Applying this to each value will produce a structure that looks like this:
Collaborator


I think this should get its own section so it can be referred to elsewhere, like in the `sorted_name_length_pairs` section.

Member Author


agreed. would it make sense in a footnote, maybe?


2 participants