Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/7.0] backport Tar fixes #76322

Merged
merged 12 commits into from
Oct 4, 2022

Conversation

github-actions[bot]
Copy link
Contributor

@github-actions github-actions bot commented Sep 28, 2022

Backport of #75902 to release/7.0

/cc @carlossanlop @jozkee

Customer Impact

Issue #75921:

Customers who attempt to use old (but most compatible) formats such as v7 and ustar and they attempt to archive a filename which exceeds the maximum allowed length of the format, will find that those names were truncated at the limit. Hence, reading of the archived entries won't match what they attempted to write.

Issue #75482:

For all formats, except Pax, we were using ASCII encoding to write out filenames and other text fields, this of course caused that non-ascii names, such as földër, would end-up garbled. When compared with tools that perform tar archiving, they were able to handle UTF8 names properly.

Issue #75360:

ustar format introduced an extra field to extend max path length to 256 with the caveat that the path must be correctly split on any path separator to fit in the previous field Name (100 bytes) and the new Prefix field (155). This required a convoluted logic that we were not properly following. We tackled this issue as part of the same PR since this code was tightly coupled to the previous two issues.

Issue #75215:

When you attempted to copy a tar, without extracting the original tar, you ended up with the following exception:

Unhandled exception. System.ArgumentException: An item with the same key has already been added. Key: path
   at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
   at System.Collections.Generic.Dictionary`2.Add(TKey key, TValue value)
   at System.Formats.Tar.TarHeader.CollectExtendedAttributesFromStandardFieldsIfNeeded()

This seems like a very simple scenario that should be supported and the error has been hit by me and another MSFT dev, which leads me to believe that it will be hit by external users as well.

Testing

Added related tests to these scenarios with the following dimensions:
[un]seekable streams,
[a]sync,
UTF8 characters that occupy two bytes of space, at the edge of the name (to check truncation)
UTF32 characters that occupy four bytes, (emojis), at the edge of the name (to check truncation)

For ustar convoluted path split logic, tested with:
absolute/relative paths,
paths as small as 1, as big as 256,
paths that are unable to split at Prefix (155)

For all issues listed above, I added roundtripping tests i.e: ensure we read and write correctly what we meant to write.

Also, I did several comparisons with other tar tools: GNU tar (ubuntu), BSD tar (Windows), and 7zip.

Risk

Relatively low, this is a new feature, changes won't negatively impact release and they are important fixes to silent errors issues.

if the fixes here happen to break something, it would be better that way vs the current silent errors.

@ghost
Copy link

ghost commented Sep 28, 2022

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Backport of #75902 to release/7.0

/cc @carlossanlop @jozkee

Customer Impact

Testing

Risk

IMPORTANT: Is this backport for a servicing release? If so and this change touches code that ships in a NuGet package, please make certain that you have added any necessary package authoring and gotten it explicitly reviewed.

Author: github-actions[bot]
Assignees: -
Labels:

area-System.IO

Milestone: -

Copy link
Member

@carlossanlop carlossanlop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signing off.

@jozkee can you please fill out the template, add the servicing-consider label, then send an email to Tactics requesting approval?

@jozkee jozkee changed the title [release/7.0] Use UTF8 encoding on Tar string fields [release/7.0] backport Tar fixes Oct 3, 2022
…ry (#76404)

* Use indexer setter instead of Add on ExtendedAttributes dictionary

* Add roundtrip tests

* Fix TryAddStringField and always set mtime
@jozkee jozkee self-assigned this Oct 3, 2022
@jozkee jozkee added this to the 7.0.0 milestone Oct 3, 2022
@jozkee jozkee added the Servicing-consider Issue for next servicing release review label Oct 3, 2022
Copy link
Member

@jeffhandley jeffhandley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. All of these issues meet the bar for issues we would fix in servicing, so I support this going into GA.

@rbhanda rbhanda added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Oct 4, 2022
@carlossanlop
Copy link
Member

CI is green. Approved by Tactics. Signed off. Ready to merge. :shipit:

@carlossanlop carlossanlop merged commit 72fcf2e into release/7.0 Oct 4, 2022
@carlossanlop carlossanlop deleted the backport/pr-75902-to-release/7.0 branch October 4, 2022 19:54
@ghost ghost locked as resolved and limited conversation to collaborators Nov 4, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Formats.Tar Servicing-approved Approved for servicing release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants