-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release/7.0] backport Tar fixes #76322
Conversation
Tagging subscribers to this area: @dotnet/area-system-io Issue DetailsBackport of #75902 to release/7.0 /cc @carlossanlop @jozkee Customer ImpactTestingRiskIMPORTANT: Is this backport for a servicing release? If so and this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signing off.
@jozkee can you please fill out the template, add the servicing-consider
label, then send an email to Tactics requesting approval?
…ry (#76404) * Use indexer setter instead of Add on ExtendedAttributes dictionary * Add roundtrip tests * Fix TryAddStringField and always set mtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me. All of these issues meet the bar for issues we would fix in servicing, so I support this going into GA.
CI is green. Approved by Tactics. Signed off. Ready to merge. |
Backport of #75902 to release/7.0
/cc @carlossanlop @jozkee
Customer Impact
Issue #75921:
Customers who attempt to use old (but most compatible) formats such as v7 and ustar and they attempt to archive a filename which exceeds the maximum allowed length of the format, will find that those names were truncated at the limit. Hence, reading of the archived entries won't match what they attempted to write.
Issue #75482:
For all formats, except Pax, we were using ASCII encoding to write out filenames and other text fields, this of course caused that non-ascii names, such as földër, would end-up garbled. When compared with tools that perform tar archiving, they were able to handle UTF8 names properly.
Issue #75360:
ustar format introduced an extra field to extend max path length to 256 with the caveat that the path must be correctly split on any path separator to fit in the previous field Name (100 bytes) and the new Prefix field (155). This required a convoluted logic that we were not properly following. We tackled this issue as part of the same PR since this code was tightly coupled to the previous two issues.
Issue #75215:
When you attempted to copy a tar, without extracting the original tar, you ended up with the following exception:
This seems like a very simple scenario that should be supported and the error has been hit by me and another MSFT dev, which leads me to believe that it will be hit by external users as well.
Testing
Added related tests to these scenarios with the following dimensions:
[un]seekable streams,
[a]sync,
UTF8 characters that occupy two bytes of space, at the edge of the name (to check truncation)
UTF32 characters that occupy four bytes, (emojis), at the edge of the name (to check truncation)
For ustar convoluted path split logic, tested with:
absolute/relative paths,
paths as small as 1, as big as 256,
paths that are unable to split at Prefix (155)
For all issues listed above, I added roundtripping tests i.e: ensure we read and write correctly what we meant to write.
Also, I did several comparisons with other tar tools: GNU tar (ubuntu), BSD tar (Windows), and 7zip.
Risk
Relatively low, this is a new feature, changes won't negatively impact release and they are important fixes to silent errors issues.
if the fixes here happen to break something, it would be better that way vs the current silent errors.