gh-116608: Ignore UTF-16 BOM in importlib.resources._functional tests #117569

Merged
encukou merged 1 commit into python:main from importlib-tests-be
Apr 5, 2024

encukou
Member

@encukou encukou commented Apr 5, 2024

To test the `errors` argument, we read a UTF-16 file as UTF-8 with "backslashreplace" error handling. However, when encoding, the `utf-16` codec prepends a byte-order mark (BOM) in the machine's native byte order, so on big-endian machines the expected value doesn't match the test file (which was saved on a little-endian machine).

Use endswith to ignore the BOM.
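
A minimal sketch of the mismatch and of the `endswith` approach, under stated assumptions: the string `text`, the helper `read_as_utf8`, and the simulated `file_bytes` are illustrative stand-ins, not the actual test data or test code from this PR.

```python
import codecs

text = "Hello, UTF-16 world!\n"  # hypothetical file content (assumption)

# The plain "utf-16" codec prepends a BOM in the machine's native byte order,
# so these bytes differ between little-endian and big-endian hosts.
native = text.encode("utf-16")

# The explicit "utf-16-le" codec writes no BOM and always uses LE order.
little = text.encode("utf-16-le")

# Simulate a file saved on a little-endian machine: LE BOM + LE payload.
file_bytes = codecs.BOM_UTF16_LE + little


def read_as_utf8(data: bytes) -> str:
    """Decode bytes as UTF-8, escaping undecodable bytes (hypothetical helper)."""
    return data.decode("utf-8", errors="backslashreplace")


actual = read_as_utf8(file_bytes)

# Building the expectation from the native "utf-16" encoding only matches
# on little-endian hosts; comparing the BOM-free tail works on both.
expected_tail = read_as_utf8(little)
assert actual.endswith(expected_tail)
```

The two BOM bytes are not valid UTF-8, so they show up as the escaped prefix `\xff\xfe` in `actual`, and `endswith` skips over that prefix.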

@encukou encukou requested review from jaraco, warsaw and FFY00 as code owners April 5, 2024 13:46
@bedevere-app bedevere-app bot added tests Tests in the Lib/test dir awaiting core review labels Apr 5, 2024
@encukou encukou changed the title gh-116609: Ignore UTF-16 BOM in importlib.resources._functional tests gh-116608: Ignore UTF-16 BOM in importlib.resources._functional tests Apr 5, 2024
@encukou
Member Author

encukou commented Apr 5, 2024

!buildbot s390x

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @encukou for commit 26ae210 🤖

The command will test the builders whose names match the following regular expression: s390x

The builders matched are:

  • s390x Fedora Rawhide Clang Installed PR
  • s390x Fedora Rawhide Clang PR
  • s390x Fedora LTO PR
  • s390x Fedora Refleaks PR
  • s390x RHEL7 LTO + PGO PR
  • s390x Fedora LTO + PGO PR
  • s390x Fedora Clang PR
  • s390x Fedora PR
  • s390x Fedora Rawhide LTO PR
  • s390x Fedora Rawhide PR
  • s390x RHEL8 LTO PR
  • s390x Fedora Rawhide Refleaks PR
  • s390x Fedora Clang Installed PR
  • s390x RHEL8 PR
  • s390x Fedora Rawhide LTO + PGO PR
  • s390x RHEL8 Refleaks PR
  • s390x RHEL8 LTO + PGO PR
  • s390x RHEL7 PR
  • s390x RHEL7 LTO PR
  • s390x RHEL7 Refleaks PR
  • s390x SLES PR
  • s390x Debian PR

@encukou encukou merged commit 4d4a6f1 into python:main Apr 5, 2024
47 of 50 checks passed
@encukou encukou deleted the importlib-tests-be branch April 5, 2024 15:00
@zooba
Member

zooba commented Apr 8, 2024

@encukou Out of interest, was the endswith necessary? I thought using utf-16-le would strip the BOM automatically, and the issue you were hitting is that utf-16-be (implied by utf-16 on BE machines) was rejecting it. Explicitly specifying -le should have worked, I'd thought.
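
For reference, a small sketch of how CPython's UTF-16 codecs treat a BOM. The string `sample` is an illustrative value, not the test data, and this only shows codec behavior; it does not reproduce the test in this PR.

```python
import codecs

sample = "hi"  # illustrative string (assumption)

# Encoding with an explicit byte order writes no BOM.
le = sample.encode("utf-16-le")

# Bytes as they would appear in a file saved with a little-endian BOM.
with_bom = codecs.BOM_UTF16_LE + le

# Decoding with the explicit codec keeps the BOM as a U+FEFF character.
decoded_explicit = with_bom.decode("utf-16-le")

# Decoding with the generic codec consumes the BOM and uses it to pick
# the byte order, regardless of the host's endianness.
decoded_generic = with_bom.decode("utf-16")
```

Here `decoded_explicit` starts with `"\ufeff"` while `decoded_generic` is just the original string.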

diegorusso pushed a commit to diegorusso/cpython that referenced this pull request Apr 17, 2024
… tests (pythonGH-117569)

pythongh-116609: Ignore UTF-16 BOM in importlib.resources._functional tests

@jaraco
Member

jaraco commented Aug 14, 2024

This change needs to be applied to importlib_resources. It looks like a related issue was reported in python/importlib_resources#312.

Labels
skip news tests Tests in the Lib/test dir