Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zipfile.Path.read_text & .open methods with a positional encoding arg causes a TypeError #101144

Closed
1 task done
gpshead opened this issue Jan 19, 2023 · 6 comments
Closed
1 task done
Assignees
Labels
3.10 only security fixes 3.11 only security fixes triaged The issue has been accepted as valid by a triager. type-bug An unexpected behavior, bug, or error

Comments

@gpshead
Copy link
Member

gpshead commented Jan 19, 2023

This is a regression from 3.9 behavior seen in Python 3.10.

zipfile.Path.read_text passes *args and **kwargs to it's own open() method, but in 3.10+ is also blindly sets kwargs["encodings"] to a value. So code that was previously passing an encoding as a positional argument now sees:

  File "/<stdlib>/zipfile.py", line 2362, in read_text
    with self.open('r', *args, **kwargs) as strm:
  File "/<stdlib>/zipfile.py", line 2350, in open
    return io.TextIOWrapper(stream, *args, **kwargs)
TypeError: Failed to construct dataset imagenet2012: argument for TextIOWrapper() given by name ('encoding') and position (2)

3.10's Lib/zipfile.py (and main's Lib/zipfile/_path.py) contain:

    def read_text(self, *args, **kwargs):
        kwargs["encoding"] = io.text_encoding(kwargs.get("encoding"))  # <-- new in 3.10, source of bug
        with self.open('r', *args, **kwargs) as strm:
            return strm.read()

As this is a regression, we should avoid setting that value in kwargs when "encodings" not in kwargs to match 3.9 and earlier behavior.

TODO list:

Linked PRs

@gpshead gpshead added type-bug An unexpected behavior, bug, or error needs backport to 3.10 only security fixes 3.11 only security fixes 3.10 only security fixes needs backport to 3.11 only security fixes labels Jan 19, 2023
@gpshead
Copy link
Member Author

gpshead commented Jan 19, 2023

(and note that 3.10 is probably closed to bugfixes)

@gpshead
Copy link
Member Author

gpshead commented Jan 19, 2023

Workaround for users, always pass encoding to read_text as a keyword encoding= argument rather than the first positional argument.

@gpshead gpshead changed the title zipfile.Path.read_text with a positional encoding arg causes argument for TextIOWrapper() given by name ('encoding') and position (2) zipfile.Path read_text and open methods with positional encoding arg causes TypeError argument given by name ('encoding') and position (2) Jan 19, 2023
gpshead added a commit to gpshead/cpython that referenced this issue Jan 19, 2023
As was the behavior in 3.9 and earlier.  The fix for
python#87817 introduced an API
regression in 3.10.0b1.
@gpshead gpshead added triaged The issue has been accepted as valid by a triager. and removed needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes labels Jan 19, 2023
@gpshead gpshead changed the title zipfile.Path read_text and open methods with positional encoding arg causes TypeError argument given by name ('encoding') and position (2) zipfile.Path.read_text & .open methods with a positional encoding arg causes a TypeError Jan 19, 2023
@jaraco
Copy link
Member

jaraco commented Jan 19, 2023

One thought that I keep having - although this incompatibility was introduced unintentionally, it seems not to have affected but one reported use-case (and probably a handful that were not reported). Perhaps it makes sense to retain the more constrained form. As I consider that, I realize that these wrappers are really trying to be true to the underlying interface, so it's probably preferable to honor that and remain consistent than to offer only a more constrained interface. In other words, the proposed fix sounds good.

@jaraco
Copy link
Member

jaraco commented Jan 19, 2023

  • Port the change to zipp.

Feel free to assign to me after the PR lands and I can handle the port. Or if you're interested in learning the process, I can share the process with you.

gpshead added a commit that referenced this issue Jan 20, 2023
The zipfile.Path open() and read_text() encoding parameter can be supplied as a positional argument without causing a TypeError again. 3.10.0b1 included a regression that made it keyword only.

Documentation update included as users writing code to be compatible with a wide range of versions will need to consider this for some time.
gpshead added a commit to gpshead/cpython that referenced this issue Jan 20, 2023
…ional. (pythonGH-101145)

The zipfile.Path open() and read_text() encoding parameter can be supplied as a positional argument without causing a TypeError again. 3.10.0b1 included a regression that made it keyword only.

Documentation update included as users writing code to be compatible with a wide range of versions will need to consider this for some time..
(cherry picked from commit 5927013)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
gpshead added a commit that referenced this issue Jan 20, 2023
…e positional (#101179)

The zipfile.Path open() and read_text() encoding parameter can be supplied as a positional argument without causing a TypeError again. 3.10.0b1 included a regression that made it keyword only.

Documentation update included as users writing code to be compatible with a wide range of versions will need to consider this for some time..

(cherry picked from commit 5927013)

Co-authored-by: Gregory P. Smith <greg@krypto.org> [Google]
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Jan 20, 2023
…g to be positional (pythonGH-101179)

The zipfile.Path open() and read_text() encoding parameter can be supplied as a positional argument without causing a TypeError again. 3.10.0b1 included a regression that made it keyword only.

Documentation update included as users writing code to be compatible with a wide range of versions will need to consider this for some time..

(cherry picked from commit 5927013)

(cherry picked from commit efe3a38)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Gregory P. Smith <greg@krypto.org> [Google]
@gpshead
Copy link
Member Author

gpshead commented Jan 20, 2023

Perhaps it makes sense to retain the more constrained form. As I consider that, I realize that these wrappers are really trying to be true to the underlying interface, so it's probably preferable to honor that

yep, I went through the same thought process. They're mirroring other APIs that take encoding and those accept it as a positional parameter, so fixing it rather than just declaring it keyword only made sense as a bugfix.

As for your zipp project @jaraco, I'll let you apply the change there. Keeping this open and reassigning to you for that. The 3.10 PR will probably automerge, if not one of us can push the button when CI is done.

miss-islington added a commit that referenced this issue Jan 20, 2023
…e positional (GH-101179) (GH-101182)

The zipfile.Path open() and read_text() encoding parameter can be supplied as a positional argument without causing a TypeError again. 3.10.0b1 included a regression that made it keyword only.

Documentation update included as users writing code to be compatible with a wide range of versions will need to consider this for some time..

(cherry picked from commit 5927013)
(cherry picked from commit efe3a38)

Co-authored-by: Gregory P. Smith <greg@krypto.org> [Google]

Automerge-Triggered-By: GH:gpshead
jaraco pushed a commit to jaraco/zipp that referenced this issue Jan 27, 2023
…onal. (python/cpython#101145)

The zipfile.Path open() and read_text() encoding parameter can be supplied as a positional argument without causing a TypeError again. 3.10.0b1 included a regression that made it keyword only.

Documentation update included as users writing code to be compatible with a wide range of versions will need to consider this for some time.
jaraco added a commit to jaraco/zipp that referenced this issue Jan 27, 2023
jaraco pushed a commit to jaraco/zipp that referenced this issue Jan 27, 2023
…onal. (python/cpython#101145)

The zipfile.Path open() and read_text() encoding parameter can be supplied as a positional argument without causing a TypeError again. 3.10.0b1 included a regression that made it keyword only.

Documentation update included as users writing code to be compatible with a wide range of versions will need to consider this for some time.
jaraco added a commit to jaraco/zipp that referenced this issue Jan 27, 2023
jaraco added a commit to jaraco/zipp that referenced this issue Jan 27, 2023
@jaraco
Copy link
Member

jaraco commented Jan 27, 2023

The backport ended up being fairly challenging, given the lack of sys.flags.warn_default_encoding and EncodingWarning on Python prior to 3.10, but also the nitpickiness of mypy and flake8. I could satisfy flake8 by monkeypatching builtins and sys.flags, but I couldn't satisfy mypy because it has the attributes of sys.flags hard-coded somewhere. Ultimately, I came up with something that's suitable and pretty true to the same spirit in jaraco/zipp#89 released as zipp 3.12 (the relationship to Python 3.12 is purely coincidental). The one downside is that the check for test_encoding_warnings is now skipped except in environments where the value is set. I'm assuming that the test suite is regularly tested with -X warn_default_encoding, so the test will catch a regression soon enough.

I plan to add PYTHONWARNDEFAULTENCODING to skeleton so that the projects I maintain get the warnings.

@jaraco jaraco closed this as completed Jan 27, 2023
clrpackages pushed a commit to clearlinux-pkgs/pypi-zipp that referenced this issue Jan 30, 2023
…n 3.12.0

Gregory P. Smith (1):
      python/cpython#101144: Allow open and read_text encoding to be positional. (python/cpython#101145)

Jason R. Coombs (13):
      Honor ResourceWarnings. Fixes jaraco/skeleton#73.
      tox 4 requires a boolean value, so use '1' to FORCE_COLOR. Fixes jaraco/skeleton#74.
      Remove unnecessary shebang and encoding header in docs conf.
      Prevent Python 3.12 from blocking checks.
      Build docs in CI, including sphinx-lint.
      Put tidelift docs dependency in its own section to limit merge conflicts.
      Update badge for 2023
      Update changelog. Ref python/cpython#101144.
      Invoke test_encoding_warnings in-process, but skip when warn_default_encoding is not set. Re-use alpharep fixture for the file.
      Provide 'sys.flags.warn_default_encoding' for the tests to skip prior to 3.10.
      Due to mypy, it's not possible to patch the value, so just be lenient in access.
      Prefer simple asserts
      Replace trailing comment with a comment on a separate line.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.10 only security fixes 3.11 only security fixes triaged The issue has been accepted as valid by a triager. type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

2 participants