Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix possible heap overflow in cram_encode_aux() on bad RG:Z tags #1737

Merged
merged 1 commit into from
Feb 2, 2024

Conversation

daviesrob
Copy link
Member

RG:Z tags without a proper NUL termination could lead to use of invalid data, or a heap overflow when the tag is passed to sam_hrecs_find_rg(), or hts_log_warning() if the former returns NULL. Fix by moving the line that skips to the end of the aux tag and then checking that it was terminated correctly. Should it not be, the aux parser is reset so the tag can be stored verbatim (the code that does that already handles badly-terminated Z tags).

Credit to OSS-Fuzz
Fixes oss-fuzz 66369

@jkbonfield
Copy link
Contributor

The change looks good, but I'm not sure it's sufficient.

It fixes the fuzzing problem, but it still generates a broken aux record as it doesn't add the nul termination when storing the RG verbatim. This then means we convert from BAM to CRAM without an error (just a warning), but produce a CRAM which cannot be decoded:

$ ./test/test_view /tmp/_.cram 
[E::sam_format1_append] Corrupted aux data for read   
Error writing output.

I'm tempted to say we should just short-cut the issue by having a hard fail. If aux is detected to be malformed, don't attempt to continue.

@daviesrob
Copy link
Member Author

I can easily switch that to a hard fail, but should the same also be done for other Z-type tags? At the moment, they will also presumably make broken cram files.

@jkbonfield
Copy link
Contributor

Yes I guess it's something that is a general issue. So maybe it's a separate problem.

@daviesrob
Copy link
Member Author

I've updated to make it fail on the bad data, and also added similar checks to the other places that handled Z and H type tags.

@@ -3093,6 +3105,11 @@ static sam_hrec_rg_t *cram_encode_aux(cram_fd *fd, bam_seq_t *b,
aux += 3;
aux_s = aux;
while (aux < aux_end && *aux++);
if (aux == aux_end && aux[-1] != '\0') {
hts_log_error("Unterminated RG:Z tag for read \"%s\"",
Copy link
Contributor

@jkbonfield jkbonfield Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incorrect error message here, but otherwise looks good.

Maybe just "Unterminated Z or H tag ..."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, that's what you get for cutting and pasting...

RG:Z tags without a proper NUL termination could lead to use of
invalid data, or a heap overflow when the tag is passed to
sam_hrecs_find_rg(), or hts_log_warning() if the former returns
NULL.  Fix by moving the line that skips to the end of the aux
tag and then checking that it was terminated correctly, failing
if it was not.

Similar checks are also added for MD:Z and generic Z- or H- type
tags, to prevent generation of unreadable CRAM files.

Credit to OSS-Fuzz
Fixes oss-fuzz 66369
@jkbonfield jkbonfield merged commit 7278dab into samtools:develop Feb 2, 2024
9 checks passed
@jkbonfield
Copy link
Contributor

Thanks. Seems to do the trick :)

@daviesrob daviesrob deleted the runaway_aux_string2 branch February 2, 2024 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants