od -c => od -tc: od -c is a compat-only XSI extension ~equivalent to LC_CTYPE=C od -tc and not universally available #2922
If all checks pass then LGTM. |
It does not look good to me. The title is just outright wrong. You can say that Also calling |
Actually that is not true either. https://pubs.opengroup.org/onlinepubs/9699919799/utilities/od.html With Both use the |
If you really want to replace Do you actually have a system that has got But let's not misleadingly bring up If the only reason why we are making this change is because |
Are they the same or is POSIX wrong or are implementations wrong? The spec says they aren't the same because they backslashise a different set of characters. This is neither here nor there really. POSIX is wrong: no implementation of -c has ever interpreted the data as characters. Confer http://ro.ws.co.ls/od#Standards. Or prowl through historical implementations yourself! Your pick. In this case the XSI text says "characters" because this is what the sysvr4 manual says: Some may also recognise the fact that previous sysv manuals also said the same thing, and those definitely weren't localised (don't worry, the sysvr4 od implementation isn't either; IIRC it tried to be but it's broken). Some would say this is because they're using "characters" to mean bytes because they're yanks and multi-byte encodings weren't quite in vogue yet. Compare the sysiii manual, which says the same thing: Implementations are wrong: Compare coreutils:
where -tc and -c are in LC_CTYPE=C. This is also https://bugs.debian.org/1037048 Compare NetBSD: Compare FreeBSD:
where -tc and -c are in the current LC_CTYPE (this is incompatible with other and historical od -c implementations). Compare OpenBSD: Compare voreutils:
There is a fundamental difference between Actually, why am I telling you, here's what XPG3 says in 1998 (described as "identical to XPG2"): Notice anything? Ah yeah, it's bytes. In 1991 (final drafts, likely much earlier) POSIX 1003.2 defines In 1994, XPG4 (SUSv1) merges into the POSIX usage the XPG3 usage and thus we get just like they thought it was the same as -tc. for good measure since this line couldn't be any more fucked). Note the EX(tension) shading (now we'd call it XSI, or, as 202x Draft 3 politely puts it, so. y'know. it's an extension. considering your program is supposed to be portable to the BSD, you shouldn't be using these extensions, since the BSD is not compliant with the SVID^WX/Open Systems Interfaces). Whereas On that note: how do you expect Oh and also, just to drive this again again again again: Unclear to me where you're getting this. The USG quite clearly says that (a)
vs
). This is why pr has an exemptions list for -e, -i, and -s. |
We use Also, we don't use non-ASCII text in the cases where we do use The only objection I'd have is if Are there systems where EDIT: Oh, you're concerned that |
There are, to my knowledge, no extant od implementations that don't have It's been the baseline standard for 31 years so this is unsurprising. |
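For the test-suite use case discussed above, a minimal sketch of dumping bytes with only the `-t c` form might look like the following. `dump_chars` is a hypothetical helper name, not something taken from this repository, and pinning `LC_ALL=C` is an assumption made here so that the current locale cannot influence how bytes are interpreted. The sketch makes no claim about what `od -c` does on any particular system.

```sh
# Hypothetical helper (not from the repository): dump stdin as characters
# using only the -t c form. LC_ALL=C is an assumption of this sketch, chosen
# so that multi-byte locale handling cannot change the output.
# -A n suppresses the offset column so only the character cells remain.
dump_chars() {
    LC_ALL=C od -A n -t c
}

# ASCII-only input, as in the tests discussed in this thread.
printf 'a\tb\n' | dump_chars
```

With ASCII-only input like this, the tab and newline should appear as the backslash sequences `\t` and `\n` in the dump.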
Thanks! |
@nicowilliams Now we have a commit that says "od -c => od -tc: od -c is an XSI extension equivalent to LC_CTYPE=C od -tc and not universally available" even though "od -c is equivalent to LC_CTYPE=C od -tc" is nonsense.
You wrote that set of tools.
No. Again, you are misinterpreting the badly worded text of POSIX. In some systems, binary and text files are encoded differently, and in those systems, POSIX literally says that "bytes shall be interpreted as characters specified by the current setting of the LC_CTYPE locale category" for
Stop fooling. YOU WROTE THAT PAGE based on your incorrect interpretation of the POSIX text. You have nothing but self-fabricated pages and utilities to suggest a deep difference in behaviour between
YOUR documentation for
Likewise, POSIX says that Again, I was aware that |
I've reverted the commit. |
This reverts commit 0e70f7a. There is no reason to revert this change. In #2922, I only disagreed with the commit message suggesting that LC_CTYPE=C od -t c is equivalent to od -c. The only documented differences are that -tc is required to be influenced by -N and -j, while -c is not, and that -c is required to only support a subset of the backslash sequences that -tc should support.
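As an illustration of the `-j`/`-N` interaction with `-t c` mentioned in that commit message, here is a small sketch. `slice_chars` is a hypothetical helper name introduced only for this example, and `LC_ALL=C` is again an assumption of the sketch. It only exercises the `-t c` form and does not test how `-c` reacts to the same options.

```sh
# Hypothetical helper (not from the repository): skip some bytes of stdin,
# then dump at most a given number of bytes as characters.
#   $1 = number of bytes to skip        (passed to -j)
#   $2 = maximum number of bytes to dump (passed to -N)
slice_chars() {
    LC_ALL=C od -A n -t c -j "$1" -N "$2"
}

# Skip "ab", then dump three bytes of the remaining input.
printf 'abcdef' | slice_chars 2 3
```

On the input `abcdef`, this should show only the characters `c`, `d`, and `e`.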
There was no reason to revert this. Removing the use of the XSI extension so tests can run correctly with their non-XSI conformant I was only disagreeing with the claim that I re-reverted the commit. |
Cf. http://ro.ws.co.ls/od#STANDARDS and your favourite POSIX PDF.