Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Linux 6.2 compatibility: Disable -funsigned-char
As of Linux 6.2, -funsigned-char is set by the build system to change how C compilers interpret the char type so that type promotion to an int will always result in a positive number. This differs from the universal meaning of char to be a signed type, such that promotion of a char to an int will sign extend the value to ensure that negative values stay negative. Linux is doing this to workaround bugs involving extended ASCII handling. Extended ASCII is supposed to use the range from [0, 255], but sloppy code that places it into a char makes that into the range [-128, 127]. This by itself is fine, but when the bit width of a char containing extended ASCII is widened to an int, the value remains in the range [-128, 127] while code for processing it in an int only works when it is in the range [0, 255]. Instead of finding all affected code and fixing it, Linux has elected to tell the C compiler to treat all `char` values as unsigned, which fixes the buggy code, but can break code that relies on negative numbers remaining negative when widened from a char to an int. I used Clang Tidy to identify all places in which we have code widen from a char to an int via its poorly named bugprone-signed-char-misuse check. At the time of this patch, all instances of a char being widened are at the following locations: module/lua/lstrlib.c @ Line 342 module/lua/lstrlib.c @ Line 343 module/nvpair/nvpair.c @ Line 2630 module/nvpair/nvpair.c @ Line 2631 module/unicode/u8_textprep.c @ Line 368 module/unicode/u8_textprep.c @ Line 584 module/unicode/u8_textprep.c @ Line 621 module/unicode/u8_textprep.c @ Line 1289 module/unicode/u8_textprep.c @ Line 1424 module/unicode/u8_textprep.c @ Line 1492 module/unicode/u8_textprep.c @ Line 1518 module/unicode/u8_textprep.c @ Line 1619 module/unicode/u8_textprep.c @ Line 1984 module/zfs/dsl_deleg.c @ Line 138 Visual inspection confirms that all of them handle how C does type promotion correctly, with one exception: module/unicode/u8_textprep.c @ Line 1289 Here, we have: size = u8_number_of_bytes[*p]; if (size <= 1 || (p + size) > oslast) break; In that case, the type promotion is from char to size_t, which is unsigned. C will still sign extend as part of the widening before treating the value as unsigned and the negative values we can counter are error values from U8_ILLEGAL_CHAR and U8_OUT_OF_RANGE_CHAR, which are -1 and -2 respectively. The unsigned versions of these under two's complement are SIZE_MAX and SIZE_MAX-1 respectively. The bounds check is written under the assumption that `size <= 1` does a signed comparison. This is followed by a pointer comparison to see if the string has the correct length, which is fine. A little further down we have: for (i = 0; i < size; i++) tc[i] = *p++; When an error condition is encountered, this will attempt to iterate at least SIZE_MAX-1 times, which will massively overflow the buffer, which is not fine. This would undoubtedly cause horrible memory corruption, which would likely crash the system, but could cause just about anything until the system crashes. That said, it is desireable to preserve the existing behavior, lest we have even more bugs, so we disallow the code to be built as a kernel module with -funsigned-char on Linux. This also prevents us from having to worry about the behavior of C compilers differing between user space and kernel space, or across platforms. A patch to fix the bug identified in module/unicode/u8_textprep.c will be written separately. Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
- Loading branch information