Header checksum is invalid #160

Closed
kriswd40 opened this issue Jan 9, 2017 · 19 comments
Labels
awaiting feedback needs-repro tar Related to TAR file format

Comments

@kriswd40

kriswd40 commented Jan 9, 2017

Steps to reproduce

  1. I'm using the "Simple full extract from a Tar archive" example exactly as is.
  2. When I call "tarArchive.ExtractContents(destFolder);", I get the error "Header checksum is invalid".
  3. I'm using an archive that was created externally, so I don't have any control to change it. I can extract this archive just fine using 7-zip.

Expected behavior

File should extract without errors.

Actual behavior

"Header checksum is invalid" error is thrown.

Version of SharpZipLib

0.86.0

Obtained from (place an x between the brackets for all that apply)

  • Package installed using:
    • NuGet
@Lette

Lette commented Jul 4, 2017

I just had the same problem.

For future Googlers: My problem turned out to be that I actually had a .tar.gz file which needed to be unzipped first. So the next example worked just fine! :-)
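A quick way to tell the two cases apart before extracting is to look at the first two bytes of the file: gzip data starts with the magic bytes 0x1F 0x8B. The following is only a minimal sketch; the class and method names are made up for illustration and are not part of SharpZipLib.

```csharp
using System;
using System.IO;

public static class ArchiveSniffer
{
    // Returns true when the stream starts with the gzip magic bytes 0x1F 0x8B.
    // The stream must be seekable; its position is restored before returning.
    public static bool LooksLikeGzip(Stream stream)
    {
        if (!stream.CanSeek)
            throw new ArgumentException("A seekable stream is required.", nameof(stream));

        long start = stream.Position;
        int first = stream.ReadByte();   // -1 if the stream is empty
        int second = stream.ReadByte();
        stream.Position = start;

        return first == 0x1F && second == 0x8B;
    }
}
```

If this returns true, the stream should be wrapped in a gzip decompression stream (for example SharpZipLib's GZipInputStream, as in the TGZ example) before it is handed to the tar code.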

@piksel
Member

piksel commented Sep 16, 2018

@kriswd40 If the input file is indeed a valid .tar (and not .tar.gz), there may be a bug here.

Could you upload a sample file that does not work?

@Lette Thanks for your input, that's probably helpful.

@bradleyasdf

I got the same issue with a tar file. I extracted the file using 7-zip, re-archived the contents using 7-zip's "Add to Archive" (Archive Format set to tar, Compression Level set to Store), and extracted the result using ICSharpCode. The error was gone.

@msvprogs

msvprogs commented Jul 13, 2019

It seems that this error occurs when I call GetNextEntry() after the stream has been fully read. The expected behavior is to return null, but it actually throws a header checksum exception.
[image]

[image]

So the workaround would be like this:

while (true)
{
   TarEntry entry;
   try
   {
      entry = input.GetNextEntry();
      if (entry == null)
         break;
   }
   catch (TarException) when (input.Position >= input.Length)
   {
      break;
   }

   using (var fileStream = tempDirectory.Path.AppendSystem(entry.Name).ToFileInfo().OpenWrite())
      input.CopyEntryContents(fileStream);
}

@piksel
Member

piksel commented Jul 14, 2019

Yeah, this seems to be related to unexpected data at the end of the input stream.

I could take a look at it, but someone needs to attach an archive that produces this problem :) You can upload it to GitHub by just dragging it to the comment textbox.

@msvprogs

Well, my archive is just too large 💦

@piksel
Member

piksel commented Jul 29, 2019

@msvprogs Hm, what is the size of the .tar and of the content when extracted? (you can use du -S PathToExtractedContent if you're on *NIX/macOS)

@bradleyasdf You don't happen to still have the "bad" file and be willing to upload it by any chance?

@msvprogs

It's an OVA virtual machine, above 700 MB, not compressed.

@piksel
Member

piksel commented Jul 29, 2019

You should be able to extract the contents of the OVA like any other tar-file.
And preferably the size in bytes :) If I'm going to deduce what is going on with the file, I can at least compare the actual sizes to an estimate of what they ought to be.

@msvprogs

Okay, here it is :)

TAR file size: 735,303,680 bytes
TAR contents size:
[image]

@silviaPixowl

Hi! I've been dealing with the "Header checksum is invalid" problem for a week and I don't know what else to try. I'm having issues only on iOS devices. I need to download a tar.gz compressed file. I'm using UnityWebRequest to download the file, and I've tried to extract it using the "Simple full extract from a TGZ (.tar.gz)" example and basically all the other examples posted, but I keep getting this error when GetNextEntry is called, from either ListContents or ExtractContents on inputTarArchive. Apparently, on iOS, when the header is parsed with ParseBuffer from the tarHeader received, the checksum obtained with the parseOctal method differs from the one produced by MakeCheckSum, and I can't figure out why. Please, I need help.

@piksel
Member

piksel commented Aug 26, 2020

@silviaPixowl I actually think this is another issue. Could you make a new issue with the code you are using to extract, and if possible provide a sample file? Also, have you tried it in the simulator? (I don't have any iPhones at hand atm. to test on)

The original issue has to do with tar archives that don't stop after the last entry. If there is data left in the buffer and that data is not just null bytes, the library will try to read that data as a new entry, but since it's not actually another entry it just throws because the checksum is invalid. We should be able to handle this a bit more gracefully though.

The number of bytes is correct for @msvprogs's file (1,436,140 blocks of 512 bytes each = 735,303,680 bytes; 1 block per file and one ending padding block), so it's not the length itself, but rather that the padding block isn't just \0 bytes for some reason.
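To illustrate the distinction described above, here is a rough sketch of how a single 512-byte tar block can be classified. It is based on the general tar (ustar) header layout, where the checksum field sits at offset 148, is 8 bytes long, and is counted as spaces when the sum is computed. It is not SharpZipLib's actual implementation, and all names in it are hypothetical.

```csharp
using System;
using System.Text;

public static class TarBlockInspector
{
    private const int BlockSize = 512;
    private const int ChecksumOffset = 148; // position of the checksum field
    private const int ChecksumLength = 8;

    // An end-of-archive block consists only of zero bytes.
    public static bool IsAllZero(byte[] block)
    {
        foreach (byte b in block)
            if (b != 0) return false;
        return true;
    }

    // Sum of all header bytes, with the checksum field itself counted as spaces.
    public static int ComputeChecksum(byte[] block)
    {
        int sum = 0;
        for (int i = 0; i < BlockSize; i++)
        {
            bool inChecksumField = i >= ChecksumOffset && i < ChecksumOffset + ChecksumLength;
            sum += inChecksumField ? (byte)' ' : block[i];
        }
        return sum;
    }

    // The stored checksum is ASCII octal, padded with NUL and/or space bytes.
    // Returns null when the field does not contain a valid octal number.
    public static int? ReadStoredChecksum(byte[] block)
    {
        string field = Encoding.ASCII
            .GetString(block, ChecksumOffset, ChecksumLength)
            .Trim('\0', ' ');
        if (field.Length == 0) return null;
        try
        {
            return Convert.ToInt32(field, 8);
        }
        catch (Exception ex) when (ex is FormatException || ex is ArgumentException || ex is OverflowException)
        {
            return null;
        }
    }

    public static string Classify(byte[] block)
    {
        if (block == null || block.Length != BlockSize)
            throw new ArgumentException("A tar block must be exactly 512 bytes.", nameof(block));

        if (IsAllZero(block)) return "end-of-archive padding";

        return ReadStoredChecksum(block) == ComputeChecksum(block)
            ? "valid header"
            : "invalid header (checksum mismatch)";
    }
}
```

Running Classify on the last block of a failing archive should show whether it really is non-zero leftover data, which is the situation that produces the checksum exception instead of a clean end of archive.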

@silviaPixowl

@silviaPixowl I actually think this is another issue. Could you make a new issue with the code you are using to extract, and if possible provide a sample file? Also, have you tried it in the simulator? (I don't have any iPhones at hand atm. to test on)

The original issue has to do with tar archives that don't stop after the last entry. If there is data left in the buffer, the library will try to read that data as a new entry, but since it's not actually another entry it just throws because the checksum is invalid. We should be able to handle this a bit more gracefully though.

Thanks very much, I've created the issue #514

@piksel
Member

piksel commented Nov 22, 2020

I did some experiments with this, but was not able to reproduce it (using OVAs created by VirtualBox). So the app creating those OVA files is probably something else. Whatever it is, I think it's not writing the end block(s), perhaps just leaving the original data in those blocks?

@msvprogs What software produced those files?

@piksel piksel added the tar Related to TAR file format label Nov 22, 2020
@RLashofRegas

RLashofRegas commented Jul 21, 2021

@piksel I believe I have a minimal reproduction of this issue. For me it is not happening with a regular ".tar" file but with a ".tar.lz4". I am attempting to use the K4os.Compression.LZ4 library to inflate the file in-stream and then SharpZipLib to extract the tar. Code for this can be found here: https://gist.github.com/RLashofRegas/f572b7fc0a5a1b1a11cddc33be0ab2ae

I cannot attach the file here because GitHub does not support uploading lz4, however it is simple to create it. The tar consists of 4 text files (test1.txt, test2.txt, etc) each one is simply 11,468,800 bytes of the letter "a" all in a directory called "TestFiles". I tarred and lz4'd the directory using this port of 7-zip that supports lz4: https://github.com/mcmilk/7-Zip-zstd

The code throws a "Header checksum is invalid" exception on the call to GetNextEntry() after getting through the first two entries in the tar. 7-zip can decompress and extract it just fine. It may be important to note that if I first inflate it with lz4 and save the tar to an intermediate file it works to extract that tar.

If you think this is a separate issue happy to open one.

@piksel
Member

piksel commented Jul 21, 2021

It may be important to note that if I first inflate it with lz4 and save the tar to an intermediate file it works to extract that tar.

Then try to diff the .tar with the LZ4Stream, since something is different about the streams. Are you using LZ4Stream.Decode to decompress in this scenario as well?

If it only happens when streaming from a 3rd-party library, I doubt the problem is inside TarInputStream itself. If it was the fourth file (the last file) it could be because of the same reason as OP ("garbage" data after last entry being treated as an invalid entry). I suspect we can't do much about it from our side though...
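For the stream comparison suggested above, something along these lines could be used. It is a small sketch using only System.IO, with hypothetical names; it reports the offset of the first byte at which two streams differ.

```csharp
using System.IO;

public static class StreamDiff
{
    // Returns the offset of the first differing byte, or -1 when both streams
    // have identical content and length.
    public static long FirstDifference(Stream expected, Stream actual)
    {
        long offset = 0;
        while (true)
        {
            int a = expected.ReadByte();
            int b = actual.ReadByte();
            if (a != b) return offset; // also covers one stream ending early
            if (a == -1) return -1;    // both streams ended at the same point
            offset++;
        }
    }
}
```

One stream would be the intermediate .tar file that extracts correctly and the other the decoded LZ4 stream. Dividing the reported offset by 512 gives the tar block in which the two first diverge. ReadByte can be slow on unbuffered streams, so wrapping both inputs in a BufferedStream is probably worthwhile for large files.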

@RLashofRegas

Thanks, I think I determined that this is not a SharpZipLib issue. It seems the lz4 file created by 7-zip differs from the one created using the K4os library. Still not sure why...

@piksel piksel closed this as completed Aug 8, 2021
@lweisberg

31496.xlsx.zip

I came upon this thread while searching for a solution to the same exception (Header checksum illegal) that I am getting. I am attaching a zip file where, after reading it into a byte array (it is actually stored in a DB table in my application), on the first iteration of the code below, I get the exception on the call to Read(). Any help would be greatly appreciated. As you can see, the file opens fine using zip (I use 7-Zip)

using System.IO;
using ICSharpCode.SharpZipLib.Zip.Compression.Streams;

public static byte[] DeCompress(byte[] bytInput)
{
	if (bytInput == null || bytInput.Length == 0)
		return null;
	Stream s2;
	using (var memoryStream = new MemoryStream(bytInput))
	{
		s2 = new InflaterInputStream(memoryStream);
		using (MemoryStream mem = new MemoryStream())
		{
			byte[] writeData = new byte[10240];
			while (true)
			{
				int size = s2.Read(writeData, 0, writeData.Length);
				if (size > 0)
				{
					mem.Write(writeData, 0, size);
				}
				else
				{
					break;
				}
			}

			mem.Close();
			return mem.ToArray();
		}
	}
}

@piksel
Member

piksel commented Mar 10, 2022

@lweisberg This issue is about tar archives, but I think you are trying to read a zip file using InflaterInputStream, which expects a raw deflate (zlib) stream as its input. A zip file is not a simple deflate stream, but its own format. Use ZipInputStream instead. You would also have to modify your code a bit, since a zip file contains multiple entries and can't just be decompressed to a single output stream.
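As a rough sketch of what that modification might look like, assuming the zip data is already in a byte array as in the code above: the class name, method name and return shape here are only an illustration, not an official sample.

```csharp
using System.Collections.Generic;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

public static class ZipBytes
{
    // Reads every file entry of a zip archive held in memory and returns
    // a map of entry name to uncompressed content.
    public static Dictionary<string, byte[]> ExtractAll(byte[] zipData)
    {
        var result = new Dictionary<string, byte[]>();

        using (var input = new MemoryStream(zipData))
        using (var zip = new ZipInputStream(input))
        {
            ZipEntry entry;
            while ((entry = zip.GetNextEntry()) != null)
            {
                if (!entry.IsFile) continue; // skip directory entries

                using (var output = new MemoryStream())
                {
                    // Reading from the ZipInputStream stops at the end of the current entry.
                    zip.CopyTo(output);
                    result[entry.Name] = output.ToArray();
                }
            }
        }

        return result;
    }
}
```

For an archive with a single file, such as an .xlsx wrapped in a zip, the result would contain one item whose value is the decompressed file.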
