Fix for Identify() failing on empty and small files: #319
- Issue
  - Identify(,bytes.NewReader([]byte{})) returns an fmt.wrapError of io.EOF
  - Identify(,bytes.NewReader([]byte{'a'})) returns an fmt.wrapError of io.ErrUnexpectedEOF
  - the expected outcome is archiver.ErrNoMatch (i.e. not a compressed stream nor an archive)
- Cause: lack of handling of io.EOF and io.ErrUnexpectedEOF outcomes of io.ReadFull()
- Fix
  - consists in handling io.EOF and io.ErrUnexpectedEOF as cases of the stream not containing enough bytes
  - and returning the available bytes up to the requested size
  - @see archiver.head()
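The fix described above can be sketched roughly as follows. This is a minimal illustration, not the actual archiver.head(): the names readHead and headString are hypothetical, and the only assumption about io.ReadFull() is its documented behavior of returning io.EOF when no bytes were read and io.ErrUnexpectedEOF when only some were.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// readHead is a hypothetical helper (not the real archiver.head()).
// It reads up to n bytes from r. A stream that is empty or shorter
// than n is not treated as an error: the bytes that were available
// are returned instead.
func readHead(r io.Reader, n int) ([]byte, error) {
	buf := make([]byte, n)
	read, err := io.ReadFull(r, buf)
	// io.EOF: nothing could be read. io.ErrUnexpectedEOF: fewer than n bytes.
	// Both only mean "not enough bytes", so they are swallowed here.
	if err != nil && !errors.Is(err, io.EOF) && !errors.Is(err, io.ErrUnexpectedEOF) {
		return nil, err
	}
	return buf[:read], nil
}

// headString is a small convenience wrapper (also hypothetical) that
// runs readHead over an in-memory string.
func headString(input string, n int) (string, error) {
	head, err := readHead(strings.NewReader(input), n)
	return string(head), err
}

func main() {
	// Empty, short, and long inputs, each asked for a 4-byte head.
	for _, input := range []string{"", "a", "abcdefgh"} {
		head, err := headString(input, 4)
		fmt.Printf("input=%q head=%q err=%v\n", input, head, err)
	}
}
```

With a helper of this shape, a format matcher can compare the returned bytes against its signature and report no match when there are too few of them, rather than surfacing the read error.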
Hey, thanks, this looks very good, I appreciate the tests too. I will look at this more soon when I have a chance!
Ok, so first off, love the test cases. Thanks for writing those!
And you're right, this is a bug in each format's Match() method. I didn't think about the case where the stream is empty or short, and indeed, any error is currently returned, but we should probably ignore EOF or UnexpectedEOF.
I am not sure we need a separate function in its own file (head()), for logic that's as simple as this:
if err != nil && !(err == io.ErrUnexpectedEOF || err == io.EOF) {
	return nil, err
}
How about instead, we simply change the offending if err != nil lines to:
if err != nil && !errors.Is(err, io.EOF) && !errors.Is(err, io.ErrUnexpectedEOF) {
	return err
}
(using errors.Is and distributing the ! is theoretically more robust and maybe easier to read)
Doing this should also simplify the now-awkward rar handling of this.
What do you think? Basically, just change the incorrect if statements instead of making a new function. (I do like that you used :n though when reading the buf slice; that's probably a good thing to keep too, even though I think ReadFull will always return n == len(buf) if there's no error.)
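To illustrate why errors.Is can be the more robust of the two checks discussed here, the following sketch compares it with plain == on a wrapped error, which is the kind of value the issue reports Identify() returning. All function names are hypothetical and exist only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// isShortRead reports whether err only signals that the stream ended
// before enough bytes could be read. errors.Is also matches errors
// that wrap io.EOF or io.ErrUnexpectedEOF.
func isShortRead(err error) bool {
	return errors.Is(err, io.EOF) || errors.Is(err, io.ErrUnexpectedEOF)
}

// isShortReadByEquality is the == variant, shown for comparison.
// It matches only the sentinel values themselves, not wrapped copies.
func isShortReadByEquality(err error) bool {
	return err == io.EOF || err == io.ErrUnexpectedEOF
}

// wrappedEOF returns io.EOF wrapped with context via %w, the way an
// intermediate layer might annotate a read error.
func wrappedEOF() error {
	return fmt.Errorf("reading header: %w", io.EOF)
}

func main() {
	err := wrappedEOF()
	fmt.Println("errors.Is matches wrapped EOF:", isShortRead(err))
	fmt.Println("== matches wrapped EOF:", isShortReadByEquality(err))
	fmt.Println("== matches bare EOF:", isShortReadByEquality(io.EOF))
}
```

When the error comes straight from io.ReadFull() the two checks agree, so the difference only matters if the error has been wrapped somewhere between the read and the check.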
Hi Matt, I am glad you loved them. I even added more to make some corner cases explicit.
** On extending <<if err != nil>> and continuing to use io.ReadFull()
** On using errors.Is()
** On distributing <!>
** On << now-awkward rar handling >>
As an alternative, we could compare the length of buf with that of rarHeadV1_5 when computing mr.ByStream.
What do you think? Patrice
Thanks for the detailed comment, Patrice, really appreciate it. I'm a bit swamped with a few other work things this week, plus am getting married, so I will circle back to this when I have a chance to give it the attention it deserves!
I took the liberty of cleaning up some of the code for consistency with the existing code base (mainly comments and minor style adjustments). Otherwise I think this change is good. Thank you for contributing it and helping me understand it! I will go ahead and merge this.
Hi Matt, That is great. Welcome back.
Hi,
I had the following issue while identifying by content:
Version:
Expected Outcome:
This commit provides a fix by properly handling io.EOF and io.ErrUnexpectedEOF. Please check archiver.head().
I have also tried to rationalize the usage of io.ReadFull() vs. io.Reader.Read(). Hopefully this matches your original rationale, or is one that you can agree with.
Best,
Patrice