cpython 0.6.0 SIGSEGVs and test failures on s390x / System Z (big endian) #265

decathorpe · 2021-07-31T10:28:14Z

I tried updating the Fedora Linux packages for rust-cpython from version 0.5.2 to 0.6.0, but there are a lot of new issues with test failures on s390x architecture (the only big-endian architecture we have access to).

With python 3.9, there are these new test failures compared to 0.5.2:

failures:
---- objectprotocol::test::test_debug_string stdout ----
thread 'objectprotocol::test::test_debug_string' panicked at 'Unknown PyUnicode_KIND', src/objects/string.rs:308:22
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
---- objectprotocol::test::test_display_string stdout ----
thread 'objectprotocol::test::test_display_string' panicked at 'Unknown PyUnicode_KIND', src/objects/string.rs:308:22
---- objects::string::test::test_extract_lone_surrogate stdout ----
thread 'objects::string::test::test_extract_lone_surrogate' panicked at 'Unknown PyUnicode_KIND', src/objects/string.rs:308:22
---- objects::string::test::test_extract_lone_surrogate_lossy stdout ----
thread 'objects::string::test::test_extract_lone_surrogate_lossy' panicked at 'Unknown PyUnicode_KIND', src/objects/string.rs:308:22
---- objects::string::test::test_extract_umlaut stdout ----
thread 'objects::string::test::test_extract_umlaut' panicked at 'Unknown PyUnicode_KIND', src/objects/string.rs:308:22
failures:
    objectprotocol::test::test_debug_string
    objectprotocol::test::test_display_string
    objects::num::test::test_u64_max
    objects::string::test::test_extract_lone_surrogate
    objects::string::test::test_extract_lone_surrogate_lossy
    objects::string::test::test_extract_umlaut
test result: FAILED. 81 passed; 6 failed; 0 ignored; 0 measured; 4 filtered out; finished in 0.01s

With Python 3.10, there are additional failures, these two I have already investigated and traced back to changes in Python 3.10 itself:

objects::num::test::float_to_* unit tests fail since floats are no longer valid initializers for integers in Python 3.10:
https://bugs.python.org/issue37999
https://docs.python.org/3.10/whatsnew/3.10.html#other-language-changes
src/objects/capsule.rs doctests fail since the "private" unicodedata.ucnhash_CAPI symbol was removed from Python 3.10:
https://bugs.python.org/issue44418

These two tests fail too, but I do not get output from cargo:

test objectprotocol::test::test_debug_string ... FAILED
test objectprotocol::test::test_display_string ... FAILED

Because it looks like the test threads / processes crash with segmentation faults:

error: test failed.
Caused by:
  process didn't exit successfully: `./target/release/deps/cpython-abe96c0dffd595a2 --skip 'objects::num::test::float_to_' --skip src/objects/capsule.rs` (signal: 11, SIGSEGV: invalid memory reference)
  process didn't exit successfully: `./target/release/deps/test_class-d070ffed372516e4 --skip 'objects::num::test::float_to_' --skip src/objects/capsule.rs` (signal: 11, SIGSEGV: invalid memory reference)
  process didn't exit successfully: `./target/release/deps/test_sharedref-2ff6a3943eb85a79 --skip 'objects::num::test::float_to_' --skip src/objects/capsule.rs` (signal: 11, SIGSEGV: invalid memory reference)

For now, I think I'll have to skip running the test suite on s390x entirely.

The text was updated successfully, but these errors were encountered:

markbt · 2021-08-11T12:43:29Z

Thanks for the report!

The PyUnicode_KIND errors look concerning. It seems kind is a bitfield, so it must be packed differently so that the Rust version of PyUnicode_KIND is wrong on this platform. I don't have access to a big-endian platform to try things out on, but it would be good to work out a safe way to access C bitfields here.

decathorpe · 2021-08-11T13:39:48Z

I only have very limited access to s390x outside our automated build environment myself, but the error can also be reproduced in emulated s390x machines in QEMU. It's slow, but I can do that on my own machine. So if you need me to test some things, I can do that. But with stuff like mapping C bitfields to Rust, I'm way out of my depth ... maybe this could be a good question to ask on users.rust-lang.org?

decathorpe · 2021-08-23T16:33:28Z

I think I figured out the problem while looking at a very similar failure in pyo3.

There's bitshift and bitwise-and operations happening here:
https://github.com/dgrunwald/rust-cpython/blob/master/python3-sys/src/unicodeobject.rs#L497

Since the byte order of that u32 is different on big-endian vs. little-endian architectures, the result of this function is wrong on big-endian architectures. This bitmask operation probably needs to be done more generically, or with two different code paths for big/little endian architectures ...

dgrunwald · 2021-09-01T20:13:33Z

Both C and Rust use a u32 to access that field, so they should both use the same byte order.
I guess the problem is rather that s390x uses a different bit order within bitfields.

I have no idea how to access a C bitfield in a portable manner from Rust.

Python has three ways of obtaining an integer from an object: 1) directly calling `PyLong_AsLong` * uses `__index__` if available * in Python 3.9 and ealier, but not in 3.10, uses `__int__` if `__index__` is unavailable * throws `TypeError` if the conversion is not possible 2) using `PyNumber_Long` * uses `__int__` * may throw other exceptions like `ValueError`, e.g. when converting from empty string 3) using `PyNumber_Index` * always uses `__index__` with any Python version * throws `TypeError` if the conversion is not possible Previously, the behavior of rust-python was inconsistent: small integer types (e.g. `i32`) used approach 1; where as larger integer types (e.g. `i64`) used approach 2. This this commit, all integer types consistently use approach 3. This is a breaking change: extension methods declared with `arg: i32` parameters no longer accept Python floating-point values. This fixes `objects::num::test::float_to_*` test failures with Python 3.10-rc1 (#265).

Python has three ways of obtaining an integer from an object: 1) directly calling `PyLong_AsLong` * uses `__index__` if available * in Python 3.9 and ealier, but not in 3.10, uses `__int__` if `__index__` is unavailable * throws `TypeError` if the conversion is not possible 2) using `PyNumber_Long` * uses `__int__` * may throw other exceptions like `ValueError`, e.g. when converting from empty string 3) using `PyNumber_Index` * always uses `__index__` with any Python version * throws `TypeError` if the conversion is not possible Previously, the behavior of rust-cpython was inconsistent: small integer types (e.g. `i32`) used approach 1; where as larger integer types (e.g. `i64`) used approach 2. With this commit, all integer types consistently use approach 3. This is a breaking change: extension methods declared with `arg: i32` parameters no longer accept Python floating-point values. This fixes `objects::num::test::float_to_*` test failures with Python 3.10-rc1 (#265).

decathorpe · 2023-07-13T16:21:52Z

I was recently able to fix this in PyO3 with this PR: PyO3/pyo3#3015

You might be able to do something similar here.

decathorpe mentioned this issue Aug 23, 2021

ffi::cpython::unicodeobject::tests::{ascii,ucs4} unit tests segfault on s390x PyO3/pyo3#1824

Closed

dgrunwald mentioned this issue Sep 1, 2021

Use PyNumber_Index when converting objects to integers. #270

Merged

decathorpe mentioned this issue Feb 10, 2023

Unit tests segfault on s390x ijl/orjson#338

Closed

decathorpe mentioned this issue Jul 13, 2023

Python 3.12 support #294

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cpython 0.6.0 SIGSEGVs and test failures on s390x / System Z (big endian) #265

cpython 0.6.0 SIGSEGVs and test failures on s390x / System Z (big endian) #265

decathorpe commented Jul 31, 2021

markbt commented Aug 11, 2021

decathorpe commented Aug 11, 2021

decathorpe commented Aug 23, 2021

dgrunwald commented Sep 1, 2021

decathorpe commented Jul 13, 2023

cpython 0.6.0 SIGSEGVs and test failures on s390x / System Z (big endian) #265

cpython 0.6.0 SIGSEGVs and test failures on s390x / System Z (big endian) #265

Comments

decathorpe commented Jul 31, 2021

markbt commented Aug 11, 2021

decathorpe commented Aug 11, 2021

decathorpe commented Aug 23, 2021

dgrunwald commented Sep 1, 2021

decathorpe commented Jul 13, 2023