-
Notifications
You must be signed in to change notification settings - Fork 150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
validation improvements, fixes #301 #331
base: dev-0.6
Are you sure you want to change the base?
Conversation
Was:
Became:
|
941897b
to
5d913fd
Compare
added a new commit to fix aac panics. refactored aac aproximate_frame_count() method. It had some issues which causes panic. Should fix cases from https://github.com/qarmin/Automated-Fuzzer/actions/runs/12516290494 (REPORTS___SYMPHONIA*)
I haven't tested on VBR aac to observe any improvements, It was mention that aproximate method is not perfect there @dedobbin please review. |
f970b72
to
2c2a8e4
Compare
fixed elst atom entry count sizes |
Added commit: core: advance reader's position if pattern cannot be found This addresses out-of-memory errors mentioned in the following comment: Issue #301 Comment. Issue description:
This commit ensures that the reader's position is always advanced, resolving the issue. |
Added commit: convert FourCc assertions to decode_error This addresses some panic errors mentioned in the following comment: #301 (comment). |
Added commit: mkv: report decode error on invalid vint width This addresses some panic errors mentioned in the following comment: #301 (comment). |
Happy new year, @sscobici! I'll be back to working on Symphonia in a few days, however, I wanted to drop in here to give you a heads up since you're touching a lot of files. I have a WIP change that rewrites MKV's EBML iterator entirely. The previous code was not fully compliant with the spec, but also had many fundamental issues. So please avoid modifying the MKV reader as any changes there will become a conflict. While working on those changes, I also realized that MP4, RIFF, and MKV are all roughly the same (i.e., nested chunk based structure) and thus all had roughly the same problems (e.g., reading out of bounds, etc.). Therefore, I'd like to apply a similar approach as the new EBML iterator to MP4's Atom iterator, and RIFF's Chunk iterator. In this PR I noticed that you've added size checks to some atom. I'm not very fond of this solution because it's not a very scalable solution, and is difficult for parent atoms. Also, it involves a lot of "magic numbers". I think the new EBML iterator provides a good solution to this problem that's more general. So, I'd also recommend avoiding making further changes like this as well. The new EBML iterator requires some additional polishing before I'm ready to merge it, but I just wanted to give you this heads up. |
ee647d2
to
d3cd3fd
Compare
Added commit: aac: add validation for Ics.info.max_sfb, fixes out of range panic during decoding. |
d3cd3fd
to
8a693db
Compare
added commits:
|
Checking in since its been more than a few days. I'm almost ready to push those changes mentioned, however I still have one more issue to solve before I can do that. |
8a693db
to
d8fb220
Compare
Added: metadata: validate min chunk length for jpeg Feedback - error message from the new ebml iterator is very generic and it would be good if this can be improved (reader position offset, element name (path)). But this is for future. |
d8fb220
to
4e5795f
Compare
added: mkv: detect parent overrun after the header parsing |
Regarding the latest commit about overrun in mkv, should this ever be allowed? What is the overall strategy for demuxer? Validation of correctness or max effort for packet and format extraction? When it should error, when it should warn and skip or try to recover? |
4e5795f
to
6788d7c
Compare
added: alac: validate mid_side_shift to not overflow |
Hey, let's start trying to get some of these fixes merged in. I think we will need to split this PR. My recommendation would be to do one PR per major component (i.e., core, riff, aac, etc.) while holding off on the MP4 changes for now. I can leave more pointed feedback on those smaller PRs.
Nice find. The EBML spec is a bit open-ended about this. However, I think this is just a special case of checking if the child element exceeds the parent element. In this case the EBML iterator just skips to the end of the parent element. However, the difference in this case is that we already consumed bytes to read the header and so we would need to rewind the reader to undo those reads. So, I have two recommendations: First, you can handle the overrun like this instead: if header.data_pos() > parent_end {
self.reader.seek_buffered(header.pos());
log::debug!("element out-of-bounds, ignoring");
return Ok(None);
} This rewinds back to the start of the header, and then ends the iteration of the parent element. Second, add a third condition here to check if Symphonia/symphonia-format-mkv/src/ebml.rs Lines 484 to 496 in 5a9bc17
The reference implementation does not clamp the value, nor does it consider it an error. Yay, undefined behaviour! However, to prevent a panic, we should clamp it to a max of 31. |
regarding mp4:
I am unsure whether Would it be beneficial for all overruns to first check whether seeking back is possible before applying Ideally, overruns should not occur if |
I think you are referring to MKV for the message..
They should be equal here since the reader should be positioned right after the element header, and the data position is always after the header. If you'd prefer to use
The overrun on line 494 is one that would have had to occur when the iterator was in the control of the caller. (i.e., they read too much before calling I'm pretty certain this scenario can only happen due to programmer error and not due to invalid data in the file.
Yes, it'll always succeed.
I'm torn, because, on the one hand, whatever was read and caused the overrun likely read bad data. On the other hand, if the overrun on line 494 can be resolved by using I'm leaning towards keeping it a hard error for now. |
6788d7c
to
bb528d9
Compare
Added More Validations for isomp4 to prevent some panics, Fixes #301, #300
Note: This update does not resolve all potential panic scenarios or address issues related to high memory consumption.
Requires a full regression run on as many files as possible to revalidate the constants put for validations.