Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM 18 breaks 128-bit integers in the interpreter #14832

Open
straight-shoota opened this issue Jul 25, 2024 · 7 comments · Fixed by #14843
Open

LLVM 18 breaks 128-bit integers in the interpreter #14832

straight-shoota opened this issue Jul 25, 2024 · 7 comments · Fixed by #14843
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. kind:regression Something that used to correctly work but no longer works topic:compiler:interpreter topic:compiler:semantic

Comments

@straight-shoota
Copy link
Member

straight-shoota commented Jul 25, 2024

An interpreter built with a LLVM 18 compiler crashes with invalid memory access when using a 128-bit integer type.

The reproduction requires a compiler with LLVM 18 which builds a compiler with interpreter. Using a 128-bit integer value in that interpreter causes an invalid memory access:

$ crystal --version
Crystal 1.13.1 (2024-07-12)

LLVM: 18.1.8
Default target: x86_64-unknown-linux-gn
$ make crystal interpreter=1 release=1
Using /usr/bin/llvm-config-18 [version=18.1.3]
CRYSTAL_CONFIG_BUILD_COMMIT="" CRYSTAL_CONFIG_PATH='$ORIGIN/../share/crystal/src' SOURCE_DATE_EPOCH="1720742400" CC="cc -fuse-ld=lld" CRYSTAL
_CONFIG_LIBRARY_PATH='$ORIGIN/../lib/crystal' ./bin/crystal build -D strict_multi_assign -D preview_overload_order --release  -o .build/cryst
al src/compiler/crystal.cr -D without_openssl -D without_zlib -D use_pcre2
$ bin/crystal i
Using compiled compiler at .build/crystal
Crystal interpreter 1.13.1 (2024-07-12).
EXPERIMENTAL SOFTWARE: if you find a bug, please consider opening an issue in
https://github.com/crystal-lang/crystal/issues/new/
icr:1> 1_i128
 => Invalid memory access (signal 11) at address 0x0
[0x55c5918c6629] ?? +94307038815785 in /app/.build/crystal
[0x55c5918c65e0] ?? +94307038815712 in /app/.build/crystal
[0x7f83b0326320] ?? +140203573535520 in /lib/x86_64-linux-gnu/libc.so.6
[0x55c591f72740] ?? +94307045812032 in /app/.build/crystal
[0x55c591d8ca5b] ?? +94307043822171 in /app/.build/crystal
[0x55c591f981f6] ?? +94307045966326 in /app/.build/crystal
[0x55c59235c83d] ?? +94307049916477 in /app/.build/crystal
[0x55c5923591cc] ?? +94307049902540 in /app/.build/crystal
[0x55c5922b34e5] ?? +94307049223397 in /app/.build/crystal
[0x55c59186773f] __crystal_main +36271 in /app/.build/crystal
[0x55c59186ddc9] main +73 in /app/.build/crystal
[0x7f83b030b1ca] ?? +140203573424586 in /lib/x86_64-linux-gnu/libc.so.6
[0x7f83b030b28b] __libc_start_main +139 in /lib/x86_64-linux-gnu/libc.so.6
[0x55c59185e8c5] _start +37 in /app/.build/crystal
[0x0] ???

The bug does not reproduce when building the interpreter in debug mode, nor in non-release mode.

This is likely related to LLVM changing the alignment of 128-bit integer types to 128 bits in LLVM 18: https://reviews.llvm.org/D86310

  • When Crystal is built with LLVM < 18: alignof(UInt128) == 8.
  • When built with LLVM >= 18: alignof(UInt128) == 16.

The interpreter may expect the old alignment somewhere.

So far it has only materialized in the interpreter and it's quite likely that this is an interpreter-specific bug.
But at this point we cannot certainly exclude any effect on other applications.

This issue appeared while trying to update the previous Crystal release to 1.13.1: #14810

@straight-shoota straight-shoota added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:semantic kind:regression Something that used to correctly work but no longer works labels Jul 25, 2024
@ysbaddaden
Copy link
Contributor

This method is suspicious; does it calculate the alignment on a 8 bytes boundary?

https://github.com/crystal-lang/crystal/blob/master/src/compiler/crystal/interpreter/context.cr#L455-L462

@straight-shoota
Copy link
Member Author

straight-shoota commented Jul 25, 2024

Yeah, looks like it. I've stumbled over that as well.
Although for size 16 it returns 16, so it probably doesn't matter for this issue...

@straight-shoota
Copy link
Member Author

Ah, no that indeed is a problem. align is solely based on size, not on the current stack pointer.
It needs to take the current stack pointer into account. If the address is not aligned to 16 bytes, we won't be able to get a 16-byte alignment.
The interpreter stack does not seem to anticipate an alignment larger than 8 bytes.

@HertzDevil
Copy link
Contributor

IIRC the interpreter memcpys the stack contents rather than dereferencing the stack pointer directly, so if there is an aligned 128-bit load/store at a misaligned address, it is probably somewhere else

@straight-shoota
Copy link
Member Author

stack_push and stack_pop use pointer dereferencing.
Or do you mean something else?

@crysbot
Copy link

crysbot commented Aug 1, 2024

This issue has been mentioned on Crystal Forum. There might be relevant details there:

https://forum.crystal-lang.org/t/invalid-memory-access-signal-11-on-1-13-1-not-present-on-1-12-2/7051/1

@straight-shoota
Copy link
Member Author

#14843 seems to fix the memory access error. However, it does so by simply avoiding misaligned stack access errors. However, the misalignment still exists.

So it does feels a bit like a workaround. It does not fix the underlying issue that 128-bit integers are not properly aligned.

# interpreter with patch #14843 and LLVM 18
pad = 1_i8
x = 1_i128
pointerof(x).address % alignof(typeof(x)) # => 8

Note: This issue only depends on the LLVM version of the interpreter itself, not of the parent compiler. The interpreter's LLVM version determines the alignment value.

If the alignmed was correct, we would not need #14843 to make the interpreter work.

It's probably still a good idea to merge #14843, though. It removes a dependency on the alignment policy of the parent compiler. This seems like a good idea anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. kind:regression Something that used to correctly work but no longer works topic:compiler:interpreter topic:compiler:semantic
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants