Unintuitive Behavior: Recovery Packets are 68 bytes larger than block size #159
Comments
Note: This may not be relevant unless someone is at the exact edge of what can be uploaded at once. 68 bytes does not mean much when working in 1k chunks, but at the extremes this does become an issue.
It sounds like the quote you included matches up with the behaviour. The point of the recovery packet is not to fit in the space specified, it's to cater for damage to the size specified. For the Usenet application, the purpose is to cater for articles which go missing. If articles are 300KB long (pre-yEnc), then a 300KB block size is optimal, as one missing article only needs one recovery block. PAR2 isn't really designed for fitting within a given block size. You've found one case, but another would be other packets which must be in a PAR2 file, which don't have size/positioning guarantees.
I was actually about to open that as a feature request. I personally don't need it for my niche application, since I have already written a basic Python par2 parser that aligns everything and uses bin packing and padding. However, it would be useful for other things. The downside of such a scheme is that, even in the best case, metadata size balloons to at least one chunk, and I know larger par2 archives can have multiple copies of the metadata. That's not really a problem when working at smaller sizes like I am, but it becomes an issue at larger recovery block sizes. Par2 has, obviously, worked well in its use case for years. However, at least noting the block size issue in the documentation and having some sort of alignment option may help with some of the use cases mentioned in the Readme. For example, storing parity information on DVDs and Blu-Rays is explicitly mentioned as an application. Having the option of creating a file where data falls neatly into a DVD's 2KiB sectors could aid in data recovery. Edit: First, I would like to confirm that par2cmdline will happily read files with padding added between packets. So, good job there! Second, you can actually think of my application as the same as the DVD example, but with far less data and a ludicrous amount of redundancy. It's just that the whole 68 bytes thing is a "gotcha" until you read the spec. Thank you for that, by the way. Compared to most specification documents I have read, the file format and associated cpp are extremely clear and easy to understand.
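As a rough illustration of the padding idea in the comment above, here is a minimal sketch that decides where each packet would start so that no packet straddles a sector boundary when it could fit inside one sector. The function name `layout_packets` and the greedy strategy are assumptions made for this example; they are not taken from the commenter's parser or from par2cmdline.

```python
def layout_packets(packet_sizes, sector_size=2048):
    """Return the start offset of each packet (hypothetical helper).

    Packets are placed in order. If a packet does not fit in what is
    left of the current sector, padding is added so that it starts on
    the next sector boundary.
    """
    offset = 0
    placements = []
    for size in packet_sizes:
        # Bytes remaining in the sector that contains `offset`.
        room = sector_size - (offset % sector_size)
        # Pad only when we are mid-sector and the packet would overflow.
        if size > room and room != sector_size:
            offset += room
        placements.append(offset)
        offset += size
    return placements


# Four 580-byte packets (512-byte blocks plus 68 bytes of overhead):
# only three fit in a 2048-byte sector, so the fourth is pushed out.
offsets = layout_packets([580, 580, 580, 580])
```

This greedy layout wastes the tail of a sector whenever the next packet does not fit; a bin-packing pass, as the comment mentions, could reorder packets to reduce that waste.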
I don't really know your specific application, but the example you give sounds a bit off to me. I see little reason why you'd specifically want the recovery blocks with metadata to be 2KB for your DVD example. Generally, you want the input block size to be 2KB, and hence the recovery block should be greater than 2KB. PAR2 probably isn't great with really small amounts of data. For one, 68 bytes out of 512 is quite some overhead. For another, there's a limit of 32768 input blocks, so if you're using a 444 byte block size, the most data you could feed PAR2 would be 13.875MB.
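The two figures in the comment above can be checked with a few lines of arithmetic. This is only a sketch; the constant and function names are made up for the example, while the numbers (32768 input blocks, 68 bytes of per-packet overhead, 444-byte blocks) come from the thread.

```python
MAX_INPUT_BLOCKS = 32768   # input block limit quoted in the thread
PACKET_OVERHEAD = 68       # 64-byte header + 4-byte exponent


def max_protected_bytes(block_size):
    """Largest amount of input data at the given block size."""
    return MAX_INPUT_BLOCKS * block_size


def overhead_fraction(block_size):
    """Share of each recovery packet that is not recovery data."""
    return PACKET_OVERHEAD / (block_size + PACKET_OVERHEAD)


limit_mib = max_protected_bytes(444) / 2**20   # 13.875
overhead = overhead_fraction(444)              # 68 / 512, about 13.3%
```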
Ouch. You are correct. It's just unfortunate that a tag is both needed, and makes alignment difficult. Personally, I feel that just giving up completely is also not the best approach though. The question is always data amount vs integrity, and I believe alignment can help. I'll look at doing a quick PR for something in the docs directory if that's okay. That way this isn't lost on anyone else attempting to use par2 like I am.
I don't know your exact aim, but from what you've indicated, what you want may be possible, though probably not through PAR2. For disclosure purposes, I don't maintain this repo or code base, so I have no say over what will be accepted in terms of PRs.
I agree this is an issue. We need to talk about an "internal block size" that Par2 uses for its calculations, and an "external block size" that the packets need to fit into. Our users probably care more about the external block size than the internal one. I'm still thinking about Parchive version 3 at the moment. We should mention this in the new spec.
Hello,
This may not be a "bug," as the code is (probably) working correctly. However, the output does not perform as the documentation says it should.
While a recovery block is created with the size specified, the packet that carries it also contains the 64-byte packet header plus a 4-byte exponent. This means the created packet will not fit into a given amount of space unless those 68 bytes are taken into account.
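To make the 68 bytes concrete, the sketch below spells out the recovery packet layout as I read it in the PAR2 specification: an 8-byte magic, an 8-byte length, a 16-byte packet hash, a 16-byte recovery set ID and a 16-byte type, followed by the 4-byte exponent and then the recovery data. The `struct` format strings and names are my own shorthand for that layout, not code from par2cmdline.

```python
import struct

# Little-endian, no padding: magic, length, packet hash, set ID, type.
HEADER_FORMAT = "<8sQ16s16s16s"
# The recovery slice body starts with a 4-byte exponent.
EXPONENT_FORMAT = "<I"

HEADER_SIZE = struct.calcsize(HEADER_FORMAT)      # 64
EXPONENT_SIZE = struct.calcsize(EXPONENT_FORMAT)  # 4


def recovery_packet_size(block_size):
    """On-disk size of one recovery packet for a given block size."""
    return HEADER_SIZE + EXPONENT_SIZE + block_size


size_for_512 = recovery_packet_size(512)  # 580, not 512
```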
Background
I have recently begun work on (yet another) system of backing up data via QR codes. As part of my research, I examined different ways of effectively storing metadata along with the ability to recover errors. This led me to both "tar" and this project. Tar stores data in 512 byte chunks. However, in attempting to make sure that my recovery data was as resilient as possible and everything was packed properly, I discovered that I had to set -s444 instead of the expected 512!
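Going the other way, the block size to pass with -s for a given amount of space can be worked out by subtracting the overhead. The helper below is a hypothetical sketch, not part of par2cmdline; it also rounds down to a multiple of 4, on the assumption (from my reading of the PAR2 spec) that block sizes must be a multiple of 4 bytes.

```python
PACKET_OVERHEAD = 64 + 4  # packet header + exponent


def block_size_for_slot(slot_size):
    """Largest block size whose recovery packet fits in slot_size bytes."""
    usable = slot_size - PACKET_OVERHEAD
    usable -= usable % 4  # assumed multiple-of-4 requirement
    if usable <= 0:
        raise ValueError("slot is too small to hold a recovery packet")
    return usable


tar_chunk = block_size_for_slot(512)    # 444, matching -s444 above
dvd_sector = block_size_for_slot(2048)  # 1980
```

This only accounts for recovery packets. As noted in the comments, the other packets in a PAR2 file have no size or positioning guarantees, so they still need separate handling.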