Open
Labels
stdlib (Python modules in the Lib dir), type-bug (An unexpected behavior, bug, or error)
Description
Bug report
Bug description:
Python decodes the bytes 0x8FA2B7 as ~ (TILDE, U+007E) in EUC-JP:
assert b'\x8f\xa2\xb7'.decode('euc_jp') == '~'
This reference document is ambiguous in that it shows a simple ~ (TILDE), but most other software (iconv, Vim, Firefox, Rust's encoding_rs) interprets these bytes as ～ (FULLWIDTH TILDE, U+FF5E). Note that EUC-JP already includes US-ASCII, so the plain tilde already has its own one-byte encoding:
assert '~'.encode('euc-jp') == b'~'
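The consequence of the two assertions above can be checked with a short script. This is a minimal sketch, not part of the original report: the variable names are illustrative, and the printed code point depends on the Python version in use, so no particular output is assumed here.

```python
import unicodedata

# Three-byte EUC-JP sequence from the report: SS3 (0x8F) followed by
# the two-byte JIS X 0212 code for the tilde character.
jis_x_0212_tilde = b"\x8f\xa2\xb7"

# Decode and show which Unicode character the codec actually chose.
decoded = jis_x_0212_tilde.decode("euc_jp")
print(f"decoded: U+{ord(decoded):04X} {unicodedata.name(decoded)}")

# The ASCII tilde is already representable as a single byte in EUC-JP.
ascii_tilde = "~".encode("euc_jp")
print(f"'~' encodes to: {ascii_tilde!r}")

# Re-encode the decoded character; errors="replace" keeps the sketch
# from raising if the codec cannot encode whatever was decoded.
reencoded = decoded.encode("euc_jp", errors="replace")
round_trips = reencoded == jis_x_0212_tilde
print(f"re-encoded: {reencoded!r}, round-trips: {round_trips}")
```

If the decoded character is U+007E, re-encoding yields the single byte b'~' rather than the original three bytes, so the sequence cannot round-trip.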
CPython versions tested on:
3.11, CPython main branch
Operating systems tested on:
Linux