[Breaking change]: ZipArchiveEntry names and comments now respect UTF8 flag when decoded #42003
Closed
1 of 3 tasks
Labels: breaking-change, doc-idea, in-pr, Pri1, 📌 seQUESTered
Description
Relates to dotnet/runtime#103271.
A ZipArchive can be created with an Encoding parameter, which is used to decode the names and comments of entries in the ZIP archive. .NET 7 and 8 introduced a regression where this encoding, when supplied, was always used regardless of the entry's UTF8 flag, with a fallback to the system default code page (UTF8 in .NET Core) if no encoding was supplied. This regression is being corrected in .NET 9: if the entry's general purpose bit flags indicate that UTF8 should be used, this will be respected; otherwise, the user-supplied encoding will be used (with the existing fallback to the system default code page if none is supplied).
I've stated that .NET 9 RC 1 introduced this change - the PR hasn't yet been merged (it's pending this work) so I've selected the next known release. It'll definitely be in .NET 9.
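The decoding rule described above can be sketched as follows. This is a language-neutral illustration written in Python, not the .NET source: the function name `decode_entry_text` and its parameters are hypothetical, and the "utf-8" default stands in for the system default code page fallback mentioned in the description.

```python
from typing import Optional

# General purpose bit 11 of a ZIP entry: the name and comment are UTF-8 encoded.
UTF8_FLAG = 0x0800


def decode_entry_text(raw: bytes, flag_bits: int,
                      user_encoding: Optional[str] = None) -> str:
    """Decode an entry name or comment using the corrected rule (hypothetical helper)."""
    if flag_bits & UTF8_FLAG:
        # The entry declares UTF-8, so any user-supplied encoding is ignored.
        return raw.decode("utf-8")
    # Flag unset: use the user-supplied encoding, else the default fallback
    # (assumed here to be UTF-8, as the description states for .NET Core).
    return raw.decode(user_encoding or "utf-8")


raw_name = "é.txt".encode("utf-8")
# Flag set: the name decodes correctly even though cp437 was requested.
print(decode_entry_text(raw_name, UTF8_FLAG, "cp437"))
# Flag unset: the requested cp437 encoding is applied to the same bytes.
print(decode_entry_text(raw_name, 0, "cp437"))
```

Under the .NET 7 and 8 behavior, the first call would have used the supplied encoding as well, which is the case this change corrects.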
Version
.NET 9 RC 1
Previous behavior
If ZipArchive was instantiated with a user-specified entryNameEncoding parameter, this encoding would always be used when decoding the names and comments of entries in the ZIP archive (even if the entry had the bit set to signify that its name and comment were encoded in UTF8).
New behavior
When a ZIP archive entry's name and comment are being decoded, its UTF8 bit flag will be respected. The user-supplied entryNameEncoding parameter will only be used to decode the entry's name and comment if this bit flag is unset.
Type of breaking change
Reason for change
This corrects a regression in .NET 7 and .NET 8 (reported in dotnet/runtime#92283). It also returns ZipArchive to compliance with the ZIP file format specification, section 4.4.4 and Appendix D.
Section 4.4.4:
Appendix D:
Recommended action
Users passing an encoding to the ZipArchive constructor should be aware that this will not be respected in all situations. It will only be used if the entry's UTF8 bit is not set.
Users who rely on ZipArchive to parse ZIP entries whose names are encoded in a non-UTF8 format, but which have the UTF8 bit flag set, will no longer be able to do so. That this worked was always a bug.
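To find out which entries in an existing archive carry the UTF8 flag, and so will ignore a supplied encoding, the flag bits can be inspected directly. The sketch below uses Python's standard `zipfile` module purely as an inspection tool; it is not a .NET API, and the helper names `utf8_flagged_entries` and `build_sample` are hypothetical.

```python
import io
import zipfile

# General purpose bit 11 of a ZIP entry: the name and comment are UTF-8 encoded.
UTF8_FLAG = 0x0800


def utf8_flagged_entries(archive) -> dict:
    """Map each entry name to True if its UTF8 flag is set (hypothetical helper)."""
    with zipfile.ZipFile(archive) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}


def build_sample() -> io.BytesIO:
    """Build a small in-memory archive for demonstration purposes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # zipfile sets the UTF8 flag only when a name is not pure ASCII.
        zf.writestr("é.txt", b"x")
        zf.writestr("plain.txt", b"y")
    buf.seek(0)
    return buf


for name, flagged in utf8_flagged_entries(build_sample()).items():
    print(name, "UTF8 flag set" if flagged else "UTF8 flag unset")
```

Entries reported as unset are the only ones for which the entryNameEncoding argument still applies after this change.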
Feature area
Core .NET libraries
Affected APIs
Associated WorkItem - 292500