CRAM validation / test files #512

Merged: 1 commit, Apr 29, 2021

Conversation

jkbonfield
Contributor

The files are numbered so that, in natural sort order, the earlier files can be decoded without yet being able to decode any of the later ones. The set can therefore serve as a roadmap for developing new CRAM decoders.

Not yet complete, but obviously better than what was there before. :-)
I made the PR so people can provide feedback while this is progressing.
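
To illustrate the natural sort order mentioned above, here is a minimal sketch of walking numbered test files in that order. The file names are hypothetical placeholders, not the names used in this PR, and `natural_key` is a helper invented for this example.

```python
import re

def natural_key(name: str):
    """Sort key that compares digit runs as numbers, so "10" sorts after "9"."""
    # re.split with a capturing group alternates non-digit and digit runs,
    # so strings and ints always line up position by position.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]

# Hypothetical file names, for illustration only.
files = ["level-10.cram", "level-2.cram", "level-1.cram"]

# A new decoder could work through the files in this order,
# stopping at the first one it cannot yet decode.
ordered = sorted(files, key=natural_key)
for name in ordered:
    print(name)
```

A plain lexical sort would place `level-10.cram` before `level-2.cram`, which is why the numeric runs are compared as integers here.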

@jkbonfield jkbonfield force-pushed the CRAM_tests branch 4 times, most recently from 600f8ca to 461f136 Compare June 29, 2020 13:28
@jkbonfield jkbonfield changed the title WIP: Initial creation of CRAM validation / test files CRAM validation / test files Jun 29, 2020
@jkbonfield jkbonfield marked this pull request as ready for review June 29, 2020 13:42
@jkbonfield jkbonfield force-pushed the CRAM_tests branch 5 times, most recently from 4e60e79 to da888d6 Compare June 29, 2020 14:08
The files are numbered such that a natural sort order will mean the
earlier files may be decodable without yet being able to decode any of
the later files.  Therefore this may optionally be considered as a
roadmap for development of new CRAM decoders.

Indices are also tested.
@jkbonfield
Contributor Author

jkbonfield commented Jun 29, 2020

Finally beaten the .md file into some reasonable state.

I think I'm happy with this as a starting point for CRAM 3.0. CRAM 3.1 would just add a few new codecs (which are already tested outside of CRAM in their own repository), so it'd be easy enough to add them when we're ready. What I don't have are failure tests, but they're not as vital as the "passed" ones.

I've included pre-made CRAMs and the samtools decoded versions of their SAM files. We probably want something similar for BAM too, but I'm all worn out in specs testing land atm.
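
As a rough sketch of how a decoder under development might be checked against the bundled SAM files: the comparison below is an assumption of this example, not part of the PR. It ignores header lines (those starting with `@`), since lines such as `@PG` typically differ between tools, and it compares alignment records as exact strings. The function names are hypothetical.

```python
def sam_records(text: str) -> list[str]:
    """Return the alignment lines of a SAM text, dropping '@' header lines."""
    return [line for line in text.splitlines()
            if line and not line.startswith("@")]

def same_alignments(decoded: str, expected: str) -> bool:
    """True if both SAM texts hold identical alignment records in the same order."""
    return sam_records(decoded) == sam_records(expected)

# Tiny made-up example: the headers differ, the single record matches.
record = "r1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\t!!!!"
expected = "@HD\tVN:1.6\n@PG\tID:samtools\n" + record + "\n"
decoded = "@HD\tVN:1.6\n" + record + "\n"
print(same_alignments(decoded, expected))
```

An exact string comparison is deliberately strict: a decoder that emits auxiliary tags in a different order would fail it, so a real harness may need a looser field-by-field comparison.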

You can see the formatted documentation here:

https://github.com/jkbonfield/hts-specs/blob/CRAM_tests/test/cram/3.0/CRAM.md

@tskir
Member

tskir commented Apr 1, 2021

As discussed on the File Formats call 2021-04-01 between @jmarshall, @lbergelson, @tcezard and @tskir, this PR will be merged in its current form in four weeks (2021-04-29), unless there are strong objections.

@tskir tskir merged commit f7c9240 into samtools:master Apr 29, 2021